Chapter 1 

An overview of the goodness-of-fit test problem 
for copulas. 



Jean-David Fermanian 



Abstract We review the main "omnibus procedures" for goodness-of-fit testing for
copulas: tests based on the empirical copula process, on probability integral
transformations, on Kendall's dependence function, etc., and some corresponding
dimension reduction techniques. The problems of finding asymptotically
distribution-free test statistics and of calculating reliable p-values are discussed.
Some particular cases, such as convenient tests for time-dependent copulas and for
Archimedean or extreme-value copulas, are dealt with. Finally, the practical
performance of the proposed approaches is briefly summarized.



1.1 Introduction 

Once a model has been stated and estimated, a key question is to check whether
the initial model assumptions are realistic. In other words, even if it is sometimes
eluded, every modeler is faced with the so-called "goodness-of-fit" (GOF) problem.
This is a long-standing statistical problem, which can be stated as follows: denoting
by $F$ the cumulative distribution function (cdf hereafter) of every observation, we
would like to test

$$\mathcal{H}_0 : F = F_0, \quad \text{against} \quad \mathcal{H}_a : F \neq F_0,$$
for a given cdf $F_0$, or, more commonly,

$$\mathcal{H}_0 : F \in \mathcal{F}, \quad \text{against} \quad \mathcal{H}_a : F \notin \mathcal{F},$$

for a given family of distributions $\mathcal{F} := \{F_\theta, \theta \in \Theta\}$. This
distinction between simple and composite assumptions is traditional and we keep it.
Nonetheless, except in some particular cases (test of independence, e.g.), the latter
framework is a lot more useful than the former in practice.

Some testing procedures are "universal" (or "omnibus"), in the sense that they can
be applied whatever the underlying distribution. In other terms, they do not depend
on particular properties of $F_0$ or of the assumed family $\mathcal{F}$. Such tests are
of primary interest for us. Note that we will not consider Bayesian testing procedures,
as proposed in [54], for instance.

To fix ideas, consider an i.i.d. sample $(X_1, \ldots, X_n)$ of a $d$-dimensional random
vector $X$. Its joint cdf is denoted by $F$, and the associated marginal cdfs by $F_j$,
$j = 1,\ldots,d$. Traditional key quantities are provided by the empirical distribution
functions of the previous sample: for every $x \in \mathbb{R}^d$, set the $d$ marginal
empirical cdfs

$$F_{n,k}(x_k) := n^{-1} \sum_{i=1}^n 1(X_{i,k} \le x_k), \quad k = 1,\ldots,d,$$

and the joint empirical cdf $F_n(x) := n^{-1} \sum_{i=1}^n 1(X_i \le x)$. The latter
inequality has to be understood componentwise. Most of the "omnibus" tests are based on
transformations of the underlying empirical distribution function, or of the empirical
process $\mathbb{F}_n := \sqrt{n}(F_n - F_0)$ itself: $T_n = \psi_n(F_n)$ or
$T_n = \psi_n(\mathbb{F}_n)$. This is the case for the famous Kolmogorov-Smirnov (KS),
Anderson-Darling (AD), Cramer-von-Mises (CvM) and chi-squared tests, for example.

Naively, one could think that the picture is the same for copulas, and that
straightforward modifications of standard GOF tests should do the job. Indeed, the
problem for copulas can be simply written as testing

$$\mathcal{H}_0 : C = C_0, \quad \text{against} \quad \mathcal{H}_a : C \neq C_0,$$

or

$$\mathcal{H}_0 : C \in \mathcal{C}, \quad \text{against} \quad \mathcal{H}_a : C \notin \mathcal{C},$$

for some copula family $\mathcal{C} := \{C_\theta, \theta \in \Theta\}$. Moreover, empirical
copulas, introduced by Deheuvels in the 1980s (see [23], [24], [25]), play the same role
for copulas as standard empirical cdfs for general distributions. For any
$u \in [0,1]^d$, they can be defined by

$$C_n(u) := F_n(F_{n,1}^{(-1)}(u_1), \ldots, F_{n,d}^{(-1)}(u_d)),$$
with the help of generalized inverse functions, or by

$$\bar C_n(u) := \frac{1}{n} \sum_{i=1}^n 1(F_{n,1}(X_{i,1}) \le u_1, \ldots, F_{n,d}(X_{i,d}) \le u_d).$$

It can be proved easily that $\|C_n - \bar C_n\|_\infty \le d n^{-1}$ (see [35]). Then, for
the purpose of GOF testing, working with $C_n$ or $\bar C_n$ does not make any difference
asymptotically. In every case, empirical copulas are explicit functionals of the
underlying empirical cdf: $C_n = \zeta(F_n)$. Thus, any previous GOF test statistic for
copulas could be defined as $T_n = \psi_n(C_n) = \psi_n \circ \zeta(F_n)$. But this
functional $\zeta$ is sufficient to induce significant technical difficulties when
applied to standard statistical procedures.
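
As an illustration of the second definition of the empirical copula (the one built from the marginal empirical cdfs), here is a minimal Python sketch. The function names `pseudo_observations` and `empirical_copula` are hypothetical, the sample is assumed to be a list of equal-length tuples, and no attempt is made at efficiency.

```python
from bisect import bisect_right

def pseudo_observations(sample):
    """Map each observation X_i to (F_{n,1}(X_{i,1}), ..., F_{n,d}(X_{i,d}))."""
    n = len(sample)
    d = len(sample[0])
    # With sorted margins, F_{n,k}(x) = #{i : X_{i,k} <= x} / n is a binary search.
    sorted_margins = [sorted(row[k] for row in sample) for k in range(d)]
    return [
        tuple(bisect_right(sorted_margins[k], row[k]) / n for k in range(d))
        for row in sample
    ]

def empirical_copula(sample):
    """Return u -> (1/n) * #{i : F_{n,k}(X_{i,k}) <= u_k for every k}."""
    pseudo = pseudo_observations(sample)
    n = len(pseudo)

    def c_n(u):
        return sum(all(p[k] <= u[k] for k in range(len(u))) for p in pseudo) / n

    return c_n
```

For instance, `empirical_copula(sample)((0.5, 0.5))` returns the proportion of observations whose two marginal ranks are both at most n/2.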

Actually, the latter parallel applies formally, but strong differences appear in
terms of the limiting laws of the "copula-related" GOF test statistics. Indeed, some
of them are distribution-free in the standard case, i.e., their limiting laws under the
null do not depend on the true underlying law $F$, and then they can be tabulated: KS
(in the univariate case) and chi-squared tests, for example. Unfortunately, it is almost
impossible to get such nice results for copulas, due to their multivariate nature and
to the complexity of the previous mapping between $F_n$ and $C_n$. Only a few GOF
test techniques for copulas induce distribution-free limiting laws. Therefore, most
of the time, some simulation-based procedures have been proposed for this task.

In section 1.2, we discuss the "brute-force" approaches based on some distances
between the empirical copula $C_n$ and the assumed copula (under the null), and we
review the associated bootstrap-like techniques. We detail how to get asymptotically
distribution-free test statistics in section 1.3, and we explain some testing
procedures that exploit the particular features of copulas. We discuss some ways of
testing membership in some "large" infinite-dimensional families of copulas, such as
Archimedean, extreme-value, vine, or HAC copulas, in section 1.4. Tests adapted
to time-dependent copulas are introduced in section 1.5. Finally, empirical
performances of these GOF tests are discussed in section 1.6.



1.2 The "brute-force" approach: the empirical copula process 
and the bootstrap 

1.2.1 Some tests based on empirical copula processes 

Such copula GOF tests are the parallels of the most standard GOF tests in the
literature, replacing $F_n$ (resp. $F_0$) by $C_n$ (resp. $C_0$). These statistics are
based on distances between the empirical copula $C_n$ and the true copula $C_0$ (simple
zero assumption), or between $C_n$ and $C_{\hat\theta_n}$ (composite zero assumption), for
some convergent and convenient estimator $\hat\theta_n$ of the "true" copula parameter
$\theta_0$. It is often reduced simply to the evaluation of norms of the empirical copula
process $\mathbb{C}_n := \sqrt{n}(C_n - C_0)$, or one of its approximations
$\hat{\mathbb{C}}_n := \sqrt{n}(C_n - C_{\hat\theta_n})$.

In this family, let us cite the Kolmogorov-Smirnov type statistics

$$T_n^{KS} := \sup_{u \in [0,1]^d} |\sqrt{n}(C_n - C_0)(u)|,$$


and the Anderson-Darling type statistics

$$T_n^{AD} := n \int (C_n - C_0)^2(u)\, w_n(u)\, du,$$

for some positive (possibly random) weight function $w_n$, and their composite
versions. By conveniently smoothing the empirical copula process, [70] defined
alternative versions of the latter tests.

In practice, the statistics $T_n^{KS}$ seem to be less powerful than a lot of competitors,
particularly those of the type $T_n^{AD}$ (see P31 ). Therefore, a "total variation"
version of $T_n^{KS}$ has been proposed in [36], which appears significantly more powerful
than the classical $T_n^{KS}$:

$$T_n^{TV} := \sup_{B_1,\ldots,B_{L_n}} \sum_{k=1}^{L_n} |\mathbb{C}_n(B_k)|, \quad \text{or} \quad \hat T_n^{TV} := \sup_{B_1,\ldots,B_{L_n}} \sum_{k=1}^{L_n} |\hat{\mathbb{C}}_n(B_k)|,$$

for simple or composite assumptions respectively. Above, the supremum is taken
over all disjoint rectangles $B_1, \ldots, B_{L_n} \subset [0,1]^d$, and $L_n \sim \ln n$.

Another example of distance is proposed in [71]: consider two functions $f_1$ and $f_2$
on $\mathbb{R}^d$. Typically, they represent copula densities. Define a positive definite
bilinear form as

$$\langle f_1, f_2 \rangle := \int \kappa_d(x_1, x_2) f_1(x_1) f_2(x_2)\, dx_1\, dx_2,$$

where $\kappa_d(x_1, x_2) := \exp(-\|x_1 - x_2\|^2 / (2 d h^2))$, for some Euclidean norm
$\|\cdot\|$ in $\mathbb{R}^d$ and a bandwidth $h > 0$. A squared distance between $f_1$ and
$f_2$ is given simply by
$\mu(f_1, f_2) := \langle f_1 - f_2, f_1 - f_2 \rangle = \langle f_1, f_1 \rangle - 2 \langle f_1, f_2 \rangle + \langle f_2, f_2 \rangle$.
When $f_1$ and $f_2$ are the copula densities of $C_1$ and $C_2$ respectively, the three
latter terms can be rewritten in terms of copulas directly. For instance,
$\langle f_1, f_2 \rangle = \int\!\!\int \kappa_d(x_1, x_2)\, C_1(dx_1)\, C_2(dx_2)$. Since such
expressions have simple empirical counterparts, a GOF test for copulas can be built
easily: typically, replace $C_1$ by the empirical copula $C_n$ and $C_2$ by the true
copula $C_0$ (or $C_{\hat\theta_n}$).

Closely connected to this family of tests are statistics $T_n$ that are zero when the
associated copula processes are zero, but not conversely. Strictly speaking, this is
the case of the Cramer-von Mises statistics

$$T_n^{CvM} := n \int (C_n - C_0)^2(u)\, C_n(du),$$

and of chi-squared type test statistics, like

$$T_n^{Chi} := n \sum_{k=1}^p w_k (C_n - C_0)^2(B_k),$$

where $B_1, \ldots, B_p$ denote disjoint boxes in $[0,1]^d$ and $w_k$, $k = 1,\ldots,p$ are
convenient weights (possibly random). More generally, we can consider

$$T_n^{\mu} := \sum_{k=1}^p \mu(C_n(E_k), C_0(E_k)), \quad \text{or} \quad \hat T_n^{\mu} := \sum_{k=1}^p \mu(C_n(E_k), C_{\hat\theta_n}(E_k)),$$

for any metric $\mu$ on the real line, and arbitrary subsets $E_1, \ldots, E_p$ in $[0,1]^d$.
This is the idea of the chi-square test detailed in [30]: set the vectors of
pseudo-observations $\hat U_i := (F_{n,1}(X_{i,1}), \ldots, F_{n,d}(X_{i,d}))$, and a partition
of $[0,1]^d$ into $p$ disjoint rectangles $B_j$. The natural chi-square-style test
statistic is

$$T_n^{\chi} := \sum_{k=1}^p \frac{(N_k - n p_k(\hat\theta_n))^2}{n p_k(\hat\theta_n)},$$

where $N_k$ denotes the number of vectors $\hat U_i$, $i = 1,\ldots,n$ that belong to $B_k$,
and $p_k(\theta)$ denotes the probability of the event $\{U \in B_k\}$ under the copula
$C_\theta$. This idea of applying an arbitrary categorization of the data into contingency
tables on $[0,1]^d$ has been applied more or less fruitfully in a lot of papers: [46],
[59], [33], [4], [58], etc.
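
The chi-square-style statistic can be sketched in a few lines. The example below is only an illustration under simplifying assumptions that are not in the text: the null is the independence copula (so that no parameter has to be estimated and every cell has probability $m^{-d}$), the partition is a regular grid of $m^d$ cells, and the margins have no ties. The names `ranks` and `chi_square_independence` are hypothetical.

```python
def ranks(values):
    """Rank of each value in 1..n (assumes no ties, as for continuous margins)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result

def chi_square_independence(sample, m):
    """Sum over cells of (N_k - n p_k)^2 / (n p_k), with p_k = m^(-d)."""
    n, d = len(sample), len(sample[0])
    margin_ranks = [ranks([row[k] for row in sample]) for k in range(d)]
    counts = {}
    for i in range(n):
        # The pseudo-observation rank/n falls in the cell (j/m, (j+1)/m]
        # with j = ceil(rank * m / n) - 1, computed in integer arithmetic.
        cell = tuple((margin_ranks[k][i] * m - 1) // n for k in range(d))
        counts[cell] = counts.get(cell, 0) + 1
    expected = n / m ** d
    statistic = sum((c - expected) ** 2 / expected for c in counts.values())
    # Each empty cell contributes (0 - expected)^2 / expected = expected.
    statistic += (m ** d - len(counts)) * expected
    return statistic
```

For a composite null, the cell probabilities would instead be computed from the estimated copula $C_{\hat\theta_n}$, and the limiting law is affected by the estimation step.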

Finally, note that a likelihood ratio test has been proposed in [30], based on a
Kullback-Leibler pseudo distance between a "discrete" version of $C_n$ and the
corresponding estimated copula under the null:

$$T_n^{LR} := \sum_{k=1}^p N_k \ln p_k(\hat\theta_n).$$

To compare the fit of two potential parametric copulas, the same information
criterion has been used in [28] to build a similar test statistic, but based on copula
densities directly.

The convergence of all these tests relies crucially on the fact that the empirical
copula processes $\mathbb{C}_n$ and $\hat{\mathbb{C}}_n$ are weakly convergent under the
null, and for convenient sequences of estimates $\hat\theta_n$: see IBD , [38], [35].
Particularly, it has been proved that $\mathbb{C}_n$ tends weakly in
$\ell^\infty([0,1]^d)$ (equipped with the metric induced by the sup-norm) to a Gaussian
process $\mathbb{G}_{C_0}$, where

$$\mathbb{G}_{C_0}(u) := \mathbb{B}_{C_0}(u) - \sum_{j=1}^d \partial_j C_0(u)\, \mathbb{B}_{C_0}(u_j, 1_{-j}), \quad \forall u \in [0,1]^d,$$

with obvious notations and for some $d$-dimensional Brownian bridge $\mathbb{B}_{C_0}$ in
$[0,1]^d$, whose covariance is

$$E[\mathbb{B}_{C_0}(u)\, \mathbb{B}_{C_0}(v)] = C_0(u \wedge v) - C_0(u) C_0(v), \quad \forall (u,v) \in [0,1]^{2d}.$$

To get this weak convergence result, it is not necessary to assume that $C_0$ is
continuously differentiable on the whole hypercube $[0,1]^d$, a condition that is often
not fulfilled in practice. Recently, [87] has shown that such a result is true when, for
every $j = 1,\ldots,d$, $\partial_j C_0$ exists and is continuous on the set
$\{u \in [0,1]^d, 0 < u_j < 1\}$.

Clearly, the law of $\mathbb{G}_{C_0}$ involves the particular underlying copula $C_0$
strongly, contrary to usual Brownian bridges. Therefore, the tabulation of the limiting
laws of $T_n$ GOF statistics appears difficult. A natural idea is to rely on
computer-intensive methods to approximate these laws numerically. The bootstrap appeared
as a natural tool for this task.



1.2.2 Bootstrap techniques 

The standard nonparametric bootstrap is based on resampling with replacement
within an original i.i.d. sample $S_X$. We get new samples $S_X^* = (X_1^*, \ldots, X_n^*)$.
Associate to every new sample its "bootstrapped" empirical copula $C_n^*$ and its
bootstrapped empirical process $\mathbb{C}_n^* := \sqrt{n}(C_n^* - C_n)$. In [35], it is
proved that, under mild conditions, this bootstrapped process $\mathbb{C}_n^*$ is weakly
convergent in $\ell^\infty([0,1]^d)$ towards the previous Gaussian process
$\mathbb{G}_{C_0}$. Therefore, in the case of simple null assumptions, we can easily get
some critical values or p-values of the previous GOF tests: resample $M$ times,
$M \gg 1$, and calculate the empirical quantiles of the obtained bootstrapped test
statistics. Nonetheless, this task has to be done for every zero assumption. This can
become a tedious and rather long task, especially when $d$ is
"large" (> 3 in practice) and/or with large datasets (> 1000, typically).

When dealing with composite assumptions, some versions of the parametric
bootstrap are advocated, depending on the limiting behavior of $\hat\theta_n - \theta_0$:
see the theory in [44] and the appendices in [45] for detailed examples. To summarize
these ideas in typical cases, it is now necessary to draw random samples from
$C_{\hat\theta_n}$. For every bootstrapped sample, calculate the associated empirical copula
$C_n^*$ and a new estimated value $\theta_n^*$ of the parameter. Since the weak limit of
$\sqrt{n}(C_n^* - C_{\theta_n^*})$ is the same as the limit of
$\hat{\mathbb{C}}_n = \sqrt{n}(C_n - C_{\hat\theta_n})$, the law of every functional of
$\hat{\mathbb{C}}_n$ can be approximated. When the cdf $C_{\hat\theta_n}$ cannot be evaluated
explicitly (in closed form), a two-level parametric bootstrap has been proposed in [44],
by first bootstrapping an approximated version of $C_{\hat\theta_n}$.
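
The parametric bootstrap loop for a composite null can be sketched as follows. This is only an illustration under assumptions that are not in the text: the family is the bivariate Clayton copula, the parameter is estimated by inversion of Kendall's tau (for Clayton, tau = theta / (theta + 2)), the functional is a Cramer-von Mises type sum, and the clipping of tau is an ad hoc numerical safeguard. All function names are hypothetical.

```python
import math
import random

def clayton_cdf(u, v, theta):
    """Bivariate Clayton copula C_theta(u, v), theta > 0."""
    return (u ** (-theta) + v ** (-theta) - 1.0) ** (-1.0 / theta)

def sample_clayton(n, theta, rng):
    """Draw n pairs from C_theta by inverting the conditional cdf C(v | u)."""
    pairs = []
    for _ in range(n):
        u = 1.0 - rng.random()  # uniform on (0, 1], avoids u = 0
        w = 1.0 - rng.random()
        v = ((w ** (-theta / (1.0 + theta)) - 1.0) * u ** (-theta) + 1.0) ** (-1.0 / theta)
        pairs.append((u, v))
    return pairs

def pseudo_observations(sample):
    """Componentwise ranks divided by n (no ties assumed)."""
    n = len(sample)
    out = [[0.0, 0.0] for _ in range(n)]
    for k in range(2):
        order = sorted(range(n), key=lambda i: sample[i][k])
        for rank, i in enumerate(order, start=1):
            out[i][k] = rank / n
    return out

def estimate_theta(sample):
    """Invert the empirical Kendall's tau to get a Clayton parameter."""
    n = len(sample)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (sample[i][0] - sample[j][0]) * (sample[i][1] - sample[j][1])
            score += (prod > 0) - (prod < 0)
    tau = 2.0 * score / (n * (n - 1))
    tau = min(max(tau, 0.01), 0.9)  # ad hoc clipping, keeps theta in a safe range
    return 2.0 * tau / (1.0 - tau)

def cvm_statistic(sample):
    """Sum over i of (C_n(U_i) - C_theta_hat(U_i))^2, theta re-estimated on sample."""
    theta = estimate_theta(sample)
    pseudo = pseudo_observations(sample)
    n = len(pseudo)
    total = 0.0
    for u, v in pseudo:
        c_n = sum(1 for a, b in pseudo if a <= u and b <= v) / n
        total += (c_n - clayton_cdf(u, v, theta)) ** 2
    return total

def parametric_bootstrap_pvalue(sample, n_boot, rng):
    """Share of statistics, recomputed on draws from C_theta_hat, above the observed one."""
    observed = cvm_statistic(sample)
    theta_hat = estimate_theta(sample)
    n = len(sample)
    exceed = sum(
        cvm_statistic(sample_clayton(n, theta_hat, rng)) > observed
        for _ in range(n_boot)
    )
    return exceed / n_boot
```

The key point is that the parameter is re-estimated on every bootstrapped sample, so that the estimation effect is reproduced in each replicate.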

Instead of resampling with replacement, a multiplier bootstrap procedure can
approximate the limiting process $\mathbb{G}_{C_0}$ (or one of its functionals), as in
[86]: consider $Z_1, \ldots, Z_n$ i.i.d. real centered random variables with variance
one, independent of the data $X_1, \ldots, X_n$. A new bootstrapped empirical copula is
defined by

$$C_n^*(u) := \frac{1}{n} \sum_{i=1}^n Z_i\, 1(F_{n,1}(X_{i,1}) \le u_1, \ldots, F_{n,d}(X_{i,d}) \le u_d),$$

for every $u \in [0,1]^d$. Setting $\bar Z_n := n^{-1} \sum_{i=1}^n Z_i$, the process
$\beta_n := \sqrt{n}(C_n^* - \bar Z_n C_n)$ tends weakly to the Brownian bridge
$\mathbb{B}_{C_0}$. By approximating (by finite differences) the derivatives of the true
copula function, it is shown in [86] how to modify $\beta_n$ to get an approximation of
$\mathbb{G}_{C_0}$. To avoid this last stage, another bootstrap procedure has been proposed
in [14]. It applies the multiplier idea to the underlying joint and marginal cdfs, and
invokes classical delta method arguments. Nonetheless, despite more attractive
theoretical properties, the latter technique does not seem to improve on the initial
multiplier bootstrap of [86]. In [62], the multiplier approach is extended to deal with
parametric copula families of any dimension, and the finite-sample performance of the
associated Cramer-von-Mises test statistics has been studied. A variant of the
multiplier approach has been proposed in [60]. It is shown that the use of multiplier
approaches instead of the parametric bootstrap leads to a strong reduction in the
computing time. Note that both methods have been implemented in the copula R package.
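
One multiplier replicate of the process $\sqrt{n}(C_n^* - \bar Z_n C_n)$ can be sketched as below. This is a minimal illustration, not the implementation of the copula R package: the multipliers are taken standard normal (one admissible choice of centered, unit-variance variables), the process is evaluated on a user-supplied finite grid, and the finite-difference correction needed to approximate the full limiting process is deliberately left out. The name `multiplier_replicate` is hypothetical.

```python
import math
import random

def multiplier_replicate(pseudo, grid, rng):
    """Evaluate sqrt(n) * (C*_n(u) - Zbar_n * C_n(u)) at every u in grid."""
    n = len(pseudo)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]  # centered, variance one
    z_bar = sum(z) / n
    values = []
    for u in grid:
        inside = [all(p_k <= u_k for p_k, u_k in zip(p, u)) for p in pseudo]
        c_star = sum(z_i for z_i, flag in zip(z, inside) if flag) / n
        c_n = sum(inside) / n
        values.append(math.sqrt(n) * (c_star - z_bar * c_n))
    return values
```

Since the pseudo-observations are computed once and only the multipliers are redrawn, each replicate avoids re-ranking the data, which is where the gain in computing time comes from.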

Recently, in [36], a modified nonparametric bootstrap technique has been introduced
to evaluate the limiting law of the previous Kolmogorov-Smirnov type test statistics
in the case of composite zero assumptions. In this case, the key process is still

$$\hat{\mathbb{C}}_n := \sqrt{n}(C_n - C_{\hat\theta_n}) = \mathbb{C}_n - \sqrt{n}(C_{\hat\theta_n} - C_{\theta_0}).$$

Generate a usual nonparametric bootstrap sample, obtained by resampling with
replacement from the original sample. This allows the calculation of the bootstrapped
empirical copula $C_n^*$ and a new parameter estimate $\theta_n^*$. Instead of considering
the "intuitive" bootstrapped empirical copula process $\sqrt{n}(C_n^* - C_{\theta_n^*})$, a
new bootstrapped process is introduced:

$$\mathbb{Y}_n^* := \sqrt{n}(C_n^* - C_n) - \sqrt{n}(C_{\theta_n^*} - C_{\hat\theta_n}).$$

Indeed, the process $\sqrt{n}(C_n^* - C_{\theta_n^*})$, while perhaps a natural candidate,
does not yield a consistent estimate of the distribution of $\hat{\mathbb{C}}_n$, contrary
to $\mathbb{Y}_n^*$. For the moment, the performance of this new bootstrapped process
remains to be studied in more depth.



1.3 Copula GOF test statistics: alternative approaches 
1.3.1 Working with copula densities 

Even if the limiting laws of the empirical copula processes $\mathbb{C}_n$ and
$\hat{\mathbb{C}}_n$ involve the underlying (true) copula in a rather complex way, it is
still possible to get asymptotically distribution-free test statistics. Unfortunately,
the price to be paid is an additional level of complexity.

To the best of our knowledge, there exists a single strategy. The idea is to rely on
copula densities themselves, rather than copulas (cdfs). Indeed, testing the identity
$C = C_0$ is equivalent to studying the closeness between the true copula density
$\tau_0$ (w.r.t. the Lebesgue measure on $[0,1]^d$, that is assumed to exist) and one of
its estimates $\tau_n$. In [33], an $L^2$-distance between $\tau_n$ and $\tau_0$ is used to
build convenient test statistics. To be specific, a kernel estimator of a copula density
$\tau$ at point $u$ is defined by

$$\tau_n(u) := \frac{1}{h^d} \int K\left(\frac{u - v}{h}\right) C_n(dv) = \frac{1}{n h^d} \sum_{i=1}^n K\left(\frac{u - \hat U_i}{h}\right),$$

where $\hat U_i := (F_{n,1}(X_{i,1}), \ldots, F_{n,d}(X_{i,d}))$ for all $i = 1,\ldots,n$.
Moreover, $K$ is a $d$-dimensional kernel and $h = h(n)$ is a bandwidth sequence, chosen
conveniently. Under some regularity assumptions, for every $m$ and all vectors
$u_1, \ldots, u_m$ in $]0,1[^d$ such that $\tau_0(u_k) > 0$ for every $k$, the vector
$(nh^d)^{1/2}((\tau_n - \tau_0)(u_1), \ldots, (\tau_n - \tau_0)(u_m))$ tends weakly to a
Gaussian random vector, whose components are independent. Therefore, under the null,
the test statistics

$$T_n^{\tau} := \frac{n h^d}{\int K^2} \sum_{k=1}^m \frac{(\tau_n(u_k) - \tau_0(u_k))^2}{\tau_0(u_k)^2}$$

tends in law towards an $m$-dimensional chi-squared distribution. This can be adapted
easily for composite assumptions. The previous test statistics depend on a finite and
arbitrary set of points $u_k$, $k = 1,\ldots,m$. To avoid this drawback, [33] has
introduced

$$J_n := \int (\tau_n - K_h * \hat\tau)^2(u)\, \omega(u)\, du,$$

for some nonnegative weight function $\omega$. Here, $\hat\tau$ denotes $\tau_0$ (simple
assumption) or $\tau(\cdot, \hat\theta_n)$ (composite assumption), for sufficiently regular
estimates $\hat\theta_n$ of $\theta_0$. It is proved that

$$T_n^{\tau,J} := \frac{n^2 h^d \left( J_n - (nh^d)^{-1} \int K^2(t)\, (\hat\tau \omega)(u - ht)\, dt\, du + (nh)^{-1} \int \hat\tau^2 \omega \cdot \sum_{r=1}^d \int K_r^2 \right)^2}{2 \int \hat\tau^2 \omega \cdot \int \left\{ \int K(u) K(u + v)\, du \right\}^2 dv}$$

tends to a $\chi^2(1)$ under the null.
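
For illustration, a kernel estimate of the copula density of the usual form $(n h^d)^{-1} \sum_i K((u - \hat U_i)/h)$ can be sketched as follows. The product Epanechnikov kernel is an assumption made here for concreteness, no boundary correction is applied, and the name `kernel_copula_density` is hypothetical.

```python
def kernel_copula_density(pseudo, h):
    """Return u -> (1 / (n h^d)) * sum_i K((u - U_i) / h) with a product kernel."""
    n = len(pseudo)
    d = len(pseudo[0])

    def epanechnikov(x):
        # Univariate kernel supported on [-1, 1], integrating to one.
        return 0.75 * (1.0 - x * x) if abs(x) <= 1.0 else 0.0

    def tau_n(u):
        total = 0.0
        for p in pseudo:
            weight = 1.0
            for k in range(d):
                weight *= epanechnikov((u[k] - p[k]) / h)
            total += weight
        return total / (n * h ** d)

    return tau_n
```

Near the faces of the hypercube part of the kernel mass falls outside $[0,1]^d$, so this plain estimator is biased there.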

Even if the previous test statistics are pivotal, they are rather complex and require
the choice of smoothing parameters and kernels. Nonetheless, such ideas have been
extended in [85] to deal with the fixed design case. Moreover, the properties of these
tests under fixed alternatives are studied in [13]. The impact of several choices of
parameter estimates $\hat\theta_n$ on the asymptotic behavior of $J_n$ is detailed too.
Apparently, for small sample sizes, the normal approximation does not provide
sufficiently exact critical values (in line with BTI or [32]), but it is still possible
to use a parametric bootstrap procedure to evaluate the limiting law of the test
statistic in this case. In the latter case, the results appear to be as good as those
of the main competitors (see [13], section 5).

Since copula densities have a compact support, kernel smoothing can generate
some undesirable boundary effects. One solution is to use improved kernel estimators
that take care of the typical corner bias problem, as in [70]. Another solution
is to estimate copula densities through wavelets, for which the border effects are
handled automatically, due to the good localization properties of the wavelet basis:
see [43]. This idea has been developed in [39], in a minimax theory framework, to
determine the largest alternative for which the decision remains feasible. Here, the
copula densities under consideration are supposed to belong to a range of Besov
balls. According to the minimax approach, the testing problem is then solved in an
adaptive framework.


1.3.2 The probability integral transformation (PIT)

A rather simple result of probability theory, proposed initially in [80], has attracted
the attention of authors for copula GOF testing purposes. Indeed, this transformation
maps a general $d$-dimensional random vector $X$ into a vector of $d$ independent
uniform random variables on $[0,1]$ in a one-to-one way. It is known as Rosenblatt's
or probability integral transformation (PIT). Once the joint law of $X$ is known and
analytically tractable, this is a universal way of generating independent and uniform
random vectors without losing statistical information. Note that other transformations
of the same type exist (see [22]).

To be specific, the copula $C$ is the joint cdf of $U := (F_1(X_1), \ldots, F_d(X_d))$. We
define the $d$-dimensional random vector $V$ by

$$V_1 := U_1 = F_1(X_1), \quad V_2 := C(U_2 | U_1), \quad \ldots, \quad V_d := C(U_d | U_1, \ldots, U_{d-1}), \qquad (1.1)$$

where $C(\cdot | u_1, \ldots, u_{k-1})$ is the law of $U_k$ given
$U_1 = u_1, \ldots, U_{k-1} = u_{k-1}$, $k = 2,\ldots,d$. Then, the variables $V_k$,
$k = 1,\ldots,d$ are uniformly and independently distributed on $[0,1]$. In other words,
$U \sim C$ iff $V = \mathcal{R}(U)$ follows the $d$-variate independence copula
$C_\perp(u) = u_1 \cdots u_d$.

The main advantage of this transformation is the simplicity of the transformed
vector $V$. This implies that the zero assumptions of a GOF test based on $V$ are
always the same: test the i.i.d. feature of $V$, that is satisfied when $C$ is the true
underlying copula. A drawback is the arbitrariness in the choice of the successive
margins. Indeed, there are at most $d!$ different PITs, which generally induce different
test statistics. Another disadvantage is the necessity of potentially tedious
calculations. Indeed, typically, the conditional joint distributions are calculated
through the formulas

$$C(u_k | u_1, \ldots, u_{k-1}) = \frac{\partial^{k-1}_{1,2,\ldots,k-1} C(u_1, \ldots, u_k, 1, \ldots, 1)}{\partial^{k-1}_{1,2,\ldots,k-1} C(u_1, \ldots, u_{k-1}, 1, \ldots, 1)},$$

for every $k = 2,\ldots,d$ and every $u \in [0,1]^d$. Therefore, with some copula families
and/or with large dimensions $d$, the explicit calculation (and coding!) of the PIT can
become infeasible.
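
As a concrete instance where the conditional cdf is available in closed form, consider the bivariate Clayton copula $C_\theta(u,v) = (u^{-\theta} + v^{-\theta} - 1)^{-1/\theta}$, $\theta > 0$, for which $C(v|u) = \partial_1 C_\theta(u,v) = u^{-\theta-1}(u^{-\theta} + v^{-\theta} - 1)^{-1/\theta - 1}$. The choice of this family is an assumption made for illustration, and the function names below are hypothetical. The sketch implements the PIT and its inverse.

```python
def clayton_conditional_cdf(v, u, theta):
    """C(v | u): partial derivative in u of the Clayton copula, theta > 0."""
    s = u ** (-theta) + v ** (-theta) - 1.0
    return u ** (-theta - 1.0) * s ** (-1.0 / theta - 1.0)

def rosenblatt_clayton(u, v, theta):
    """Probability integral transformation (V_1, V_2) = (U_1, C(U_2 | U_1))."""
    return u, clayton_conditional_cdf(v, u, theta)

def inverse_rosenblatt_clayton(v1, v2, theta):
    """Map independent uniforms (v1, v2) back to a pair with copula C_theta."""
    u = v1
    v = ((v2 ** (-theta / (1.0 + theta)) - 1.0) * u ** (-theta) + 1.0) ** (-1.0 / theta)
    return u, v
```

Applying `rosenblatt_clayton` to data whose copula is Clayton with the same `theta` should yield pairs that look independent and uniform; the inverse map is the usual conditional sampling scheme.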

The application of such transformations for copula GOF testing appeared first
in [12]. This idea has been reworked and extended in several papers afterwards:
see OD , [10], [40], [8], etc. Several applications of such techniques to financial
series modelling and risk management have emerged, notably [65], lETl , [19], [63], [92],
among others.

For copula GOF testing, we are only interested in the copula itself, and the
marginal distributions $F_k$, $k = 1,\ldots,d$ are seen as nuisance parameters.
Therefore, they are usually replaced by the marginal empirical cdfs $F_{n,k}$.
Equivalently, the observations $X_i$, $i = 1,\ldots,n$ are often replaced by their
pseudo-observations $\hat U_i := (F_{n,1}(X_{i,1}), \ldots, F_{n,d}(X_{i,d}))$. Moreover, for
composite zero assumptions, the chosen estimator $\hat\theta_n$ disturbs the limiting law
of the test statistics most of the time. This difficulty is typical of the statistics of
copulas, and it is a common source of mistakes, as pointed out in [34]. For instance, in
[12], these problems were not tackled conveniently and the reported p-values are
incorrect. [12] noticed that the r.v. $Y = \sum_{k=1}^d [\Phi^{-1}(V_k)]^2$ follows a
$\chi^2(d)$. But this is no longer the case for
$\hat Y = \sum_{k=1}^d [\Phi^{-1}(\hat V_{n,k})]^2$, where $\hat V = \mathcal{R}(\hat U)$. This
point has been pointed out in [44]. A corrected test statistic with reliable p-values
has been introduced in ll3D . An extension of these tests has been introduced in [10].
It involves data-driven weight functions, possibly to emphasize some regions of the
underlying copula. Its comparative performances are studied in [9] and ®.

Thus, to the best of our knowledge, all the previously proposed test procedures
have to rely on bootstrap procedures to evaluate the corresponding limiting laws
under the null. This is clearly a shame, keeping in mind the simplicity of the law of
$V$, after a PIT of the original dataset (but with known margins). In practice, we have
to work with (transformed) pseudo-observations $\hat V_i$, $i = 1,\ldots,n$. As we said,
they are calculated from formulas (1.1), replacing the unobservable uniformly
distributed vectors $U_i$ by the pseudo-observations $\hat U_i$, $i = 1,\ldots,n$. The
vectors $\hat V_i$ are no longer independent and only approximately uniform on
$[0,1]^d$. Nonetheless, test statistics $T_n^{\psi} = \psi(\hat V_1, \ldots, \hat V_n)$ may be
relevant, for convenient real functions $\psi$. In general and for composite zero
assumptions, we are not ensured that the law of $\hat V$, denoted by $C_{\infty,V}$, tends
to the independence copula. If we were able to evaluate $C_{\infty,V}$, a "brute-force"
approach would still be possible, as in section 1.2. For instance and naively, we could
introduce the Kolmogorov-type statistics

$$T_n^{KS,PIT} := \sup_{u \in (0,1)^d} \left| \frac{1}{n} \sum_{i=1}^n 1(\hat V_i \le u) - C_{\infty,V}(u) \right|.$$

Nonetheless, due to the difficulty of evaluating $C_{\infty,V}$ precisely (by Monte-Carlo,
in practice), most authors have preferred to reduce the dimensionality of the problem.
In this way, they are able to tackle the case $d \ge 3$ more easily.



1.3.3 Reductions of dimension 

Generally speaking, in a GOF test, it is tempting to reduce the dimensionality of
the underlying distributions, for instance from $d$ to one. Indeed, especially when
$d \gg 1$, the "brute-force" procedures based on empirical processes involve
significant analytical or numerical difficulties in practice. For instance, a
Cramer-von-Mises statistic requires the calculation of a $d$-dimensional integral.

Formally, a reduction of dimension means replacing the initial GOF problem
"$\mathcal{H}_0$ : the copula of $X$ is $C_0$" by "$\mathcal{H}_0^*$ : the law of $\psi(X)$ is
$G_{\psi,0}$", for some transformation $\psi : \mathbb{R}^d \to \mathbb{R}^p$, with
$p \ll d$, and for some $p$-dimensional cdf $G_{\psi,0}$. As $\mathcal{H}_0$ implies
$\mathcal{H}_0^*$, we decide to reject $\mathcal{H}_0$ when $\mathcal{H}_0^*$ is not
satisfied. Obviously, this reduction of the available information induces a loss of
power, but the practical advantages of this trick often dominate its drawbacks.

For instance, when $p = 1$ and if we are able to identify $G_{\psi,0}$, it becomes
possible to invoke standard univariate GOF test statistics, or even to use ad-hoc visual
procedures like QQ-plots. Thus, by reducing a multivariate GOF problem to a univariate
problem, we rely on numerically efficient procedures, even for high dimensional
underlying distributions. However, we still depend on Monte-Carlo methods to evaluate
the corresponding p-values. Inspired by [84], we get one of the most naive methods of
dimension reduction: replace $T_n^{KS}$ above by

$$\tilde T_n^{KS} := \sup_{\alpha \in (0,1)} |C_n(A_\alpha) - C_0(A_\alpha)|, \quad \text{or} \quad \hat{\tilde T}_n^{KS} := \sup_{\alpha \in (0,1)} |C_n(\hat A_\alpha) - C_{\hat\theta_n}(\hat A_\alpha)|,$$

where $(A_\alpha)_{\alpha \in (0,1)}$ is an increasing sequence of subsets in $[0,1]^d$ s.t.
$A_\alpha = \{u \in [0,1]^d \,|\, C_0(u) \le \alpha\}$ and
$\hat A_\alpha = \{u \in [0,1]^d \,|\, C_{\hat\theta_n}(u) \le \alpha\}$.

To revisit a previous example and with the same notations, Oil considered particular
test statistics $T_n^{\psi}$ based on the variables
$\hat Z_i := \sum_{k=1}^d [\Phi^{-1}(\hat V_{i,k})]^2$, $i = 1,\ldots,n$. If the margins
$F_k$, $k = 1,\ldots,d$, and the true copula $C_0$ were known, then we would be able to
calculate $Z_i := \sum_{k=1}^d [\Phi^{-1}(V_{i,k})]^2$, which follows a chi-square law of
dimension $d$ under the null. Since it is not the case in practice, the limiting law of
$\hat Z_i$ is unknown, and it has to be evaluated numerically by simulations. It is
denoted by $F_\psi$. Therefore, OTI propose to test

$$\mathcal{H}_0^* : \text{the asymptotic law of } T_n^{\psi} \text{ is a given cdf } F_\psi \text{ (to be estimated)},$$

where $T_n^{\psi,PIT}$ is defined by usual (univariate) Kolmogorov-Smirnov,
Anderson-Darling or Cramer-von-Mises test statistics. For instance,

$$T_n^{AD,PIT} := n \int \frac{(F_{n,\psi} - F_\psi)^2}{F_\psi (1 - F_\psi)}\, dF_\psi,$$

where $F_{n,\psi}$ is the empirical cdf of the pseudo-sample $\hat Z_1, \ldots, \hat Z_n$.
Note that $F_{n,\psi}$ and $F_\psi$ depend strongly on the underlying cdf of $X$, its true
copula $C_0$, the way marginal cdfs have been estimated to get pseudo-observations
(empirical or parametric estimates) and possibly the particular estimate
$\hat\theta_n$.

Besides the PIT idea, there potentially exist a lot of possibilities for dimension
reduction. They will provide more or less relevant test statistics, depending on the
particular underlying parametric family and on the empirical features of the data. For
instance, in the bivariate case, Kendall's tau $\tau_K$ or Spearman's rho $\rho_S$ may
appear as nice "average" measures of dependence. They are just single numbers, instead
of a true 2-dimensional function like $C_n$. Therefore, such a GOF test may be simply

$$\mathcal{H}_0^* : \tau_K = \tau_{K,C_0},$$

where $\tau_{K,C_0} = 4 E_{C_0}[C_0(U)] - 1$ is the Kendall's tau of the true copula $C_0$,
and $\hat\tau_K$ is an estimate of this measure of dependence, for instance its empirical
counterpart

$$\hat\tau_{K,n} := \frac{2}{n(n-1)} \left[ \text{number of concordant pairs of observations} - \text{number of discordant pairs} \right].$$

Here, we can set $T_n^{KTau} := n(\hat\tau_{K,n} - \tau_{K,C_0})^2$, or
$\hat T_n^{KTau} := n(\hat\tau_{K,n} - \tau_{K,C_{\hat\theta_n}})^2$ in the case of composite
assumptions. Clearly, the performances of all these tests in terms of power will be very
different and there is no hope of getting a clear hierarchy between all of them.
Sometimes, it will be relevant to discriminate between several distributions depending
on the behaviors in the tails. Thus, some adapted summaries of the information provided
by the underlying copula $C$ are required, like tail-indices for instance (see [68]
e.g.). But in every case, their main weakness is a lack of consistency against a large
family of alternatives. For instance, the previous test $T_n^{KTau}$ will not be able to
discriminate between all copulas that have the same Kendall's tau $\tau_{K,C_0}$. In other
words, this dimension reduction is probably too strong, most of the time: we reduce a
$d$-dimensional problem to a real number. It is more fruitful to keep the idea of
generating a univariate process, i.e., going from a dimension $d$ to a dimension one.
This is the idea of Kendall's process (see below).
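
The statistic $n(\hat\tau_{K,n} - \tau_{K,C_0})^2$ for a simple null is straightforward to compute from concordant and discordant pairs. The sketch below is a plain quadratic-time illustration for bivariate data without ties; the names `kendall_tau` and `tau_test_statistic` are hypothetical, and critical values would still have to be obtained separately (for instance by simulation).

```python
def kendall_tau(sample):
    """(2 / (n (n - 1))) * (#concordant pairs - #discordant pairs)."""
    n = len(sample)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (sample[i][0] - sample[j][0]) * (sample[i][1] - sample[j][1])
            # +1 for a concordant pair, -1 for a discordant pair, 0 for a tie.
            score += (product > 0) - (product < 0)
    return 2.0 * score / (n * (n - 1))

def tau_test_statistic(sample, tau_null):
    """n * (tau_hat - tau_null)^2, where tau_null is Kendall's tau under the null."""
    n = len(sample)
    return n * (kendall_tau(sample) - tau_null) ** 2
```

For example, testing independence corresponds to `tau_null = 0.0`.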

Another closely related family of tests is based on the comparison between several
parameter estimates. They have been called "moment-based" GOF test statistics
(see [88], BTI , [11]). In their simplest form, assume a univariate unknown copula
parameter $\theta$, and two estimation equations ("moments") such that
$m_1 = r_1(\theta)$ and $m_2 = r_2(\theta)$ (one-to-one mappings). Given empirical
counterparts $\hat m_k$ of $m_k$, $k = 1,2$, [88] has proposed the copula GOF test

$$T_n^{moment} := \sqrt{n} \left\{ r_1^{-1}(\hat m_1) - r_2^{-1}(\hat m_2) \right\}.$$

Typically, some estimating equations are provided by Kendall's tau and Spearman's
rho, that have well-known empirical counterparts. Nonetheless, other estimates have
been proposed, such as the pseudo-maximum likelihood (also called "canonical maximum
likelihood"). To deal with multi-dimensional parameters $\theta$, estimating equations
can be obtained by the equality between the Hessian matrix and minus the
expected outer product of the score function. This is the idea of White's specification
test (see [93]), adapted to copulas in [76].
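
A moment-based statistic of this form can be illustrated with a family for which both moments are linear in the parameter. The Farlie-Gumbel-Morgenstern copula, $C_\theta(u,v) = uv(1 + \theta(1-u)(1-v))$, is chosen here purely as an assumption for the example: its Kendall's tau is $2\theta/9$ and its Spearman's rho is $\theta/3$, so that $r_1^{-1}(\tau) = 9\tau/2$ and $r_2^{-1}(\rho) = 3\rho$. The function names are hypothetical and ties are assumed absent.

```python
import math

def ranks(values):
    """Rank of each value in 1..n (no ties assumed)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result

def kendall_tau(sample):
    """Empirical Kendall's tau from concordant and discordant pairs."""
    n = len(sample)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (sample[i][0] - sample[j][0]) * (sample[i][1] - sample[j][1])
            score += (product > 0) - (product < 0)
    return 2.0 * score / (n * (n - 1))

def spearman_rho(sample):
    """Empirical Spearman's rho: 1 - 6 * sum(d_i^2) / (n (n^2 - 1))."""
    n = len(sample)
    rx = ranks([x for x, _ in sample])
    ry = ranks([y for _, y in sample])
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))

def fgm_moment_statistic(sample):
    """sqrt(n) * (theta from tau - theta from rho) under the FGM family."""
    n = len(sample)
    theta_from_tau = 9.0 * kendall_tau(sample) / 2.0
    theta_from_rho = 3.0 * spearman_rho(sample)
    return math.sqrt(n) * (theta_from_tau - theta_from_rho)
```

Under the null the two estimates target the same $\theta$, so large absolute values of the statistic point towards a misspecified family.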



1.3.4 Kendall's process 

This is another well-known example of dimension reduction related to copula
problems. Let $C$ be the copula of an arbitrary random vector $X \in \mathbb{R}^d$.
Define the univariate cdf

$$K(t) := P(C(U) \le t), \quad \forall t \in \mathbb{R},$$

where, as usual, we set $U = (F_1(X_1), \ldots, F_d(X_d))$. The function $K$ depends on
$C$ only. Therefore, this univariate function is a "summary" of the underlying
dependence structure given by $C$. It is called the Kendall's dependence function of
$C$. An empirical counterpart of $K$ is the empirical Kendall's function

$$K_n(t) := \frac{1}{n} \sum_{i=1}^n 1(C_n(\hat U_i) \le t),$$

with pseudo-observations $\hat U_1, \ldots, \hat U_n$. The associated Kendall's process is
simply given by $\mathbb{K}_n = \sqrt{n}(K_n - K)$, or
$\hat{\mathbb{K}}_n = \sqrt{n}(K_n - K(\hat\theta_n, \cdot))$ when the true copula is
unknown but belongs to a given parametric family. The properties of Kendall's processes
have been studied in depth in [5], [49], and [40] particularly. In the latter papers,
the weak convergence of $\hat{\mathbb{K}}_n$ towards a continuous centered Gaussian
process in the Skorohod space of cadlag functions is proved, for convenient consistent
sequences of estimates $\hat\theta_n$. Its variance-covariance function is complex and
copula dependent. It depends on the derivatives of $K$ w.r.t. the parameter $\theta$ and
the limiting law of $\sqrt{n}(\hat\theta_n - \theta_0)$.
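
The empirical Kendall's function $t \mapsto n^{-1} \sum_i 1(C_n(\hat U_i) \le t)$ can be computed directly from the data, since, without ties, $C_n(\hat U_i)$ is the proportion of observations that are componentwise smaller than or equal to $X_i$. The following is a quadratic-time sketch under that no-ties assumption; the name `empirical_kendall_function` is hypothetical.

```python
from bisect import bisect_right

def empirical_kendall_function(sample):
    """Return t -> (1/n) * #{i : C_n(U_i) <= t}."""
    n = len(sample)
    # C_n(U_i) = proportion of points dominated componentwise by X_i (itself included).
    levels = sorted(
        sum(all(a <= b for a, b in zip(other, row)) for other in sample) / n
        for row in sample
    )

    def k_n(t):
        return bisect_right(levels, t) / n

    return k_n
```

For comonotone data the values $C_n(\hat U_i)$ are spread evenly over $\{1/n, \ldots, 1\}$, whereas for countermonotone bivariate data they all equal $1/n$.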

Then, there are many possible GOF tests based on the univariate function $K_n$
or the associated process $\hat{\mathbb{K}}_n$. For instance, [90] introduced a
test statistic based on the $L^2$ norm of $\hat{\mathbb{K}}_n$. To be specific,
they restrict themselves to bivariate Archimedean copulas, but allow censoring.
That is why their GOF test statistic
$T_n^{L^2,\mathrm{Kendall}} = \int_\xi^1 |\hat{\mathbb{K}}_n|^2$ involves an
arbitrary cut-off point $\xi > 0$. Nonetheless, the idea of such a statistic is
still valid for arbitrary dimensions and copulas. It has been extended in [40],
which considers

$$T_n^{L^2,\mathrm{Kendall}} := \int_0^1 |\hat{\mathbb{K}}_n(t)|^2 \, k(\hat\theta_n, t)\, dt, \quad \text{and} \quad T_n^{KS,\mathrm{Kendall}} := \sup_{t \in [0,1]} |\hat{\mathbb{K}}_n(t)|,$$

where $k(\theta, \cdot)$ denotes the density of $C(U)$ w.r.t. the Lebesgue
measure (i.e. the derivative of $K$), and $\hat\theta_n$ is a consistent
estimate of the true parameter under the null.
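
As an illustration of the empirical Kendall's function and of a Kolmogorov-Smirnov type statistic built on it, here is a minimal Python sketch. It assumes a simple null hypothesis with a known Kendall function (the bivariate independence copula is used as the example), approximates the supremum on a grid, and ignores ties. All function names are ours and purely illustrative.

```python
import math


def pseudo_observations(sample):
    """Rank-transform each column: U_ik = rank(X_ik) / (n + 1). Ties are not handled."""
    n, d = len(sample), len(sample[0])
    columns = []
    for k in range(d):
        order = sorted(range(n), key=lambda i: sample[i][k])
        ranks = [0] * n
        for rank, i in enumerate(order, start=1):
            ranks[i] = rank
        columns.append([r / (n + 1) for r in ranks])
    return [tuple(col[i] for col in columns) for i in range(n)]


def empirical_copula_at(pseudo_obs, u):
    """C_n(u): proportion of pseudo-observations that are componentwise <= u."""
    hits = sum(1 for p in pseudo_obs if all(a <= b for a, b in zip(p, u)))
    return hits / len(pseudo_obs)


def empirical_kendall(pseudo_obs):
    """Return the function t -> K_n(t) = (1/n) * #{i : C_n(U_i) <= t}."""
    w = [empirical_copula_at(pseudo_obs, u) for u in pseudo_obs]
    n = len(w)
    return lambda t: sum(1 for w_i in w if w_i <= t) / n


def kendall_independence(t):
    """K(t) = t - t*ln(t): Kendall function of the bivariate independence copula."""
    if t <= 0.0:
        return 0.0
    return 1.0 if t >= 1.0 else t - t * math.log(t)


def ks_kendall(pseudo_obs, K0, grid_size=200):
    """sqrt(n) * max over a regular grid of |K_n(t) - K0(t)| (approximates the sup)."""
    K_n = empirical_kendall(pseudo_obs)
    grid = [(g + 1) / (grid_size + 1) for g in range(grid_size)]
    return math.sqrt(len(pseudo_obs)) * max(abs(K_n(t) - K0(t)) for t in grid)
```

Under a composite null hypothesis, `K0` would be replaced by $K(\hat\theta_n, \cdot)$ for the assumed family, and critical values would typically be obtained by simulation.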

Nonetheless, working with $\mathbb{K}_n$ or $\hat{\mathbb{K}}_n$ instead of
$\mathbb{C}_n$ or $\hat{\mathbb{C}}_n$ respectively is not a panacea. As we
said, the dimension reduction is not free of charge, and testing
$\mathcal{H}_0^*$ instead of $\mathcal{H}_0$ reduces the ability to
discriminate between copula alternatives. For instance, consider two
extreme-value copulas $C_1$ and $C_2$, i.e., in the bivariate case,

$$C_j(u,v) = \exp\left( \ln(uv)\, A_j\left( \frac{\ln v}{\ln(uv)} \right) \right), \quad j = 1, 2,$$

for some Pickands functions $A_1$ and $A_2$ (convex functions on $[0,1]$, such
that $\max(t, 1-t) \le A_j(t) \le 1$ for all $t \in [0,1]$). As noticed in
[49], the associated Kendall's functions are

$$K_j(t) = t - (1 - \tau_{K,j})\, t \ln t, \quad t \in (0,1),$$

where $\tau_{K,j}$ denotes the Kendall's tau of $C_j$. Then, if the two
Kendall's tau are the same, the corresponding Kendall's functions $K_1$ and
$K_2$ are identical. Thus, a test of $\mathcal{H}_0^* : K = K_0$ will appear
worthless if the underlying copulas are of the extreme-value type.

In practice, the evaluation of the true Kendall function $K_0$ under the null
may become tedious, or even infeasible, for many copula families. Therefore,
[9] proposed to apply the previous Kendall process methodology to random
vectors obtained through a PIT in a preliminary stage, to "stabilize" the
limiting law under the null. In this case, $K_0$ is always the same: the
Kendall function associated with the independence copula $C_\perp$. This idea
has been implemented in P31, in the form of Cramer-von-Mises GOF test
statistics of the type

$$T_n^{CvM,PIT} := n \int (D_n(u) - C_\perp(u))^2 \, dD_n(u) = \sum_{i=1}^n (D_n(\hat U_i) - C_\perp(\hat U_i))^2,$$

where $D_n(u) = n^{-1} \sum_{i=1}^n 1(\hat U_i \le u)$ is the empirical cdf
associated with the pseudo-observations of the sample. Nonetheless, the
limiting behavior of all these test statistics is not distribution-free for
composite zero assumptions, and limiting laws have to be evaluated numerically
by Monte-Carlo methods (as usual).
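
The sum form of the Cramer-von-Mises statistic above is easy to compute once the (PIT-transformed) pseudo-observations are available. The following Python sketch is only illustrative: the PIT step itself depends on the assumed copula family and is taken as given, and `simulate_statistic` is a hypothetical user-supplied function that must replay the whole procedure (simulation under the null, estimation, transformation) to return one simulated value of the statistic.

```python
import math


def empirical_cdf_at(points, u):
    """D_n(u): proportion of points that are componentwise <= u."""
    return sum(1 for p in points if all(a <= b for a, b in zip(p, u))) / len(points)


def cvm_pit_statistic(transformed):
    """sum_i (D_n(E_i) - prod_k E_ik)^2 for PIT-transformed pseudo-observations E_i."""
    return sum(
        (empirical_cdf_at(transformed, e) - math.prod(e)) ** 2 for e in transformed
    )


def monte_carlo_p_value(observed, simulate_statistic, n_rep):
    """Share of simulated statistics >= observed, with the usual +1 correction."""
    exceed = sum(1 for _ in range(n_rep) if simulate_statistic() >= observed)
    return (1 + exceed) / (n_rep + 1)
```

For a composite null hypothesis, each call to `simulate_statistic` should re-estimate the parameter on the simulated sample, since the limiting law depends on the estimation step.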

Note that [77] have proposed a similar idea, but based on Spearman's dependence
function $L$ instead of Kendall's dependence function. Formally, $L$ is defined
by

$$L(u) := \mathbb{P}(C_\perp(U) \le u), \quad \forall u \in [0,1].$$

When working with a random sample, the empirical counterpart of $L$ is then

$$L_n(u) := \frac{1}{n} \sum_{i=1}^n 1(C_\perp(\hat U_i) \le u),$$

and all the previous GOF test statistics may be applied. For instance, [8]
proposed to use the Cramer-von-Mises statistic

$$T_n^{L,CvM} := \int_0^1 (L_n - L_{\hat\theta_n})^2 \, L_n(du),$$

where $L_\theta$ is the Spearman's dependence function of an assumed copula
$C_\theta$, and $\hat\theta_n$ is an estimate of the true parameter under the
zero assumption.
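
A minimal Python sketch of the empirical Spearman's dependence function and of the associated Cramer-von-Mises statistic follows. Since $L_n$ jumps by $1/n$ at each product $w_i = \prod_k \hat U_{ik}$, the integral with respect to $L_n(du)$ reduces to an average over the $w_i$. The function names are ours, and `L_model` stands for the Spearman's function of the assumed copula, which the user must supply.

```python
import bisect
import math


def empirical_spearman_function(pseudo_obs):
    """Return u -> L_n(u) = (1/n) * #{i : prod_k U_ik <= u}."""
    products = sorted(math.prod(u) for u in pseudo_obs)
    n = len(products)
    return lambda u: bisect.bisect_right(products, u) / n


def cvm_spearman(pseudo_obs, L_model):
    """int (L_n - L_model)^2 dL_n = (1/n) * sum_i (L_n(w_i) - L_model(w_i))^2."""
    L_n = empirical_spearman_function(pseudo_obs)
    w = [math.prod(u) for u in pseudo_obs]
    return sum((L_n(w_i) - L_model(w_i)) ** 2 for w_i in w) / len(w)


def spearman_function_independence(u):
    """L(u) = P(U1 * U2 <= u) = u - u*ln(u) for two independent uniform variables."""
    if u <= 0.0:
        return 0.0
    return 1.0 if u >= 1.0 else u - u * math.log(u)
```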



1.4 GOF tests for some particular classes of copulas 

Besides omnibus GOF tests, there exist other test statistics that are related
to particular families of copulas only. We will not study such GOF tests when
they are related to particular finite-dimensional parametric families (to
decide whether $C_0$ is a Gaussian copula, for instance). Nonetheless, in this
section, we will be interested in a rather unusual GOF problem: to say whether
$C_0$ belongs to a particular infinite-dimensional parametric family of
copulas. Among such large families, some are important in practice: the
Archimedean family, the elliptical one, extreme-value copulas, vines,
hierarchical Archimedean copulas, etc.



1.4.1 Testing the Archimedeanity 

All the previously proposed test statistics can be applied when $\mathcal{C}$
is an assumed particular Archimedean family, as in [90], [83]... Other test
statistics, based on some analytical properties of Archimedean copulas, have
been proposed too ([52], for instance). Interestingly, [46] proposed a
graphical procedure for selecting an Archimedean copula (among several
competitors), through a visual comparison between the empirical Kendall's
function $K_n$ and an estimated Kendall function obtained under a composite
null hypothesis $\mathcal{H}_0$.

Now, we would like to test "$\mathcal{H}_0$ : $C$ is Archimedean" against the
opposite, i.e., without any assumption concerning a particular parametric
family. This problem has not received a lot of attention in the literature,
despite its practical importance.

Consider first the (unknown) generator $\phi$ of the underlying bivariate
copula $C$, i.e. $C(u) = \phi^{-1}(\phi(u_1) + \phi(u_2))$ for every
$u = (u_1, u_2) \in [0,1]^2$. [46] proved that
$V_1 := \phi(F_1(X_1)) / \{\phi(F_1(X_1)) + \phi(F_2(X_2))\}$ is uniformly
distributed on $(0,1)$ and that $V_2 := C(F_1(X_1), F_2(X_2))$ is distributed
as the Kendall's dependence function $K(t) = t - \phi(t)/\phi'(t)$. Moreover,
$V_1$ and $V_2$ are independent. Since $K$ can be estimated empirically, these
properties provide a way of estimating $\phi$ itself (by $\hat\phi_n$).
Therefore, as noticed in the conclusion of [46], if the underlying copula is
Archimedean, then the r.v.

$$\hat V_1 := \hat\phi_n(F_{1,n}(X_1)) / \{\hat\phi_n(F_{1,n}(X_1)) + \hat\phi_n(F_{2,n}(X_2))\}$$

should be distributed uniformly on $(0,1)$ asymptotically. This observation can
lead to some obvious GOF test procedures.
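
To make these objects concrete, the sketch below evaluates the copula, the Kendall function $K(t) = t - \phi(t)/\phi'(t)$ and the variable $V_1$ for a known generator, here the Clayton generator $\phi(t) = (t^{-\theta} - 1)/\theta$, chosen only as an example. In the testing problem of the text the generator is unknown and would be replaced by an estimate. The last helper computes a Kolmogorov-Smirnov distance to the uniform law, one possible way (our assumption, not a procedure taken from the cited papers) of checking the uniformity of the $\hat V_1$ values.

```python
import math


def clayton_generator(theta):
    """Return (phi, phi_prime, phi_inverse) for the Clayton family, theta > 0."""
    phi = lambda t: (t ** (-theta) - 1.0) / theta
    phi_prime = lambda t: -(t ** (-theta - 1.0))
    phi_inverse = lambda s: (1.0 + theta * s) ** (-1.0 / theta)
    return phi, phi_prime, phi_inverse


def archimedean_copula(phi, phi_inverse, u1, u2):
    """C(u1, u2) = phi^{-1}(phi(u1) + phi(u2))."""
    return phi_inverse(phi(u1) + phi(u2))


def kendall_function(phi, phi_prime, t):
    """K(t) = t - phi(t) / phi'(t), for t in (0, 1]."""
    return t - phi(t) / phi_prime(t)


def v1(phi, u1, u2):
    """V_1 = phi(u1) / (phi(u1) + phi(u2)); uniform on (0,1) for an Archimedean copula."""
    return phi(u1) / (phi(u1) + phi(u2))


def ks_distance_to_uniform(values):
    """sup_t |F_n(t) - t| for the empirical cdf F_n of values lying in [0, 1]."""
    v = sorted(values)
    n = len(v)
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(v))
```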

Another testing strategy starts from the following property, proved in [68]:
a bivariate copula $C$ is Archimedean iff it is associative (i.e.
$C(u_1, C(u_2,u_3)) = C(C(u_1,u_2), u_3)$ for every triplet $(u_1,u_2,u_3)$ in
$[0,1]^3$) and satisfies the inequality $C(u,u) < u$ for all $u \in (0,1)$.
This property, known as Ling's Theorem (see [64]), has been extended to an
arbitrary dimension $d > 2$ by [89]. Then, [56] proposed to test the
associativity of $C$ to check the validity of the Archimedean zero assumption.
For every couple $(u_1, u_2)$ in $(0,1)^2$, he defined the test statistic

$$\mathcal{T}_n(u_1,u_2) := \sqrt{n} \{ C_n(u_1, C_n(u_2,u_2)) - C_n(C_n(u_1,u_2), u_2) \}.$$

Despite its simplicity, the latter pointwise approach is not consistent against
a large class of alternatives. For instance, there exist copulas that are
associative but not Archimedean. Therefore, [15] revisited this idea, by
invoking fully the previous characterization of Archimedean copulas. To deal
with associativity, they introduced the trivariate process

$$\mathcal{T}_n(u_1,u_2,u_3) := \sqrt{n} \{ C_n(u_1, C_n(u_2,u_3)) - C_n(C_n(u_1,u_2), u_3) \},$$

and proved its weak convergence in $\ell^\infty([0,1]^3)$. Cramer-von-Mises
$T_n^{CvM}$ and Kolmogorov-Smirnov $T_n^{KS}$ test statistics can be built on
$\mathcal{T}_n$. To reject associative copulas that are not Archimedean, these
statistics are slightly modified, using some chosen constant $a \in (0,1/2)$
and some increasing function $\psi$ with $\psi(0) = 0$. Therefore, such final
tests are consistent against all departures from Archimedeanity.

Unfortunately, the two previous procedures are limited to bivariate copulas, and 
their generalization to higher dimensions d seems to be problematic. 
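
The associativity process is straightforward to evaluate from the empirical copula. The Python sketch below computes $\sqrt{n}\{C_n(u_1, C_n(u_2,u_3)) - C_n(C_n(u_1,u_2), u_3)\}$ and a grid approximation of a Kolmogorov-Smirnov type statistic. It covers the associativity part only: the additional modification needed to reject associative but non-Archimedean copulas is not implemented, and the function names are ours.

```python
import math


def empirical_copula(pseudo_obs):
    """Return (u1, u2) -> C_n(u1, u2) for bivariate pseudo-observations."""
    n = len(pseudo_obs)
    return lambda u1, u2: sum(1 for a, b in pseudo_obs if a <= u1 and b <= u2) / n


def associativity_process(pseudo_obs, u1, u2, u3):
    """sqrt(n) * {C_n(u1, C_n(u2, u3)) - C_n(C_n(u1, u2), u3)}."""
    C_n = empirical_copula(pseudo_obs)
    gap = C_n(u1, C_n(u2, u3)) - C_n(C_n(u1, u2), u3)
    return math.sqrt(len(pseudo_obs)) * gap


def ks_associativity(pseudo_obs, grid):
    """Maximum of |associativity_process| over all triplets of grid points."""
    return max(
        abs(associativity_process(pseudo_obs, a, b, c))
        for a in grid for b in grid for c in grid
    )
```

For instance, with the comonotone points $(i/4, i/4)$, $i = 1, \ldots, 4$, the empirical copula coincides with $\min(u_1, u_2)$ on the grid $\{1/4, 1/2, 3/4\}$, so the process vanishes there.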



1.4.2 Extreme-value dependence 

As we have seen previously, bivariate extreme-value copulas are written as

$$C(u,v) = \exp\left( \ln(uv)\, A\left( \frac{\ln v}{\ln(uv)} \right) \right), \qquad (1.2)$$

for every $u, v$ in $(0,1)$, where $A : [0,1] \to [1/2, 1]$ is convex and
satisfies $\max(t, 1-t) \le A(t) \le 1$ for every $t \in [0,1]$. Therefore,
such copulas are fully parameterized by the so-called Pickands dependence
function $A$, which is univariate. Extreme-value copulas are important in many
fields because they characterize the large-sample limits of copulas of
componentwise maxima of strongly mixing stationary sequences ([26], [53], and
the recent survey [50]). Then, it should be of interest to test whether the
underlying copula can be represented by (1.2), for some unspecified dependence
function $A$.

Studying the Kendall's process associated with an extreme-value copula $C$,
[48] have noticed that, by setting $W := C(U_1, U_2)$, we have
$K(t) = \mathbb{P}(W \le t) = t - (1 - \tau)\, t \ln(t)$, for every
$t \in (0,1)$, where $\tau$ is the underlying Kendall's tau. Moreover, they
show that the moments of $W$ are $E[W^i] = (i\tau + 1)/(i+1)^2$, for all
$i \ge 1$. Therefore, under $\mathcal{H}_0$, $-1 + 8E[W] - 9E[W^2] = 0$. Then
they proposed a test (that the underlying copula is extreme-value) based on an
empirical counterpart of the latter relation: set

$$S_n := -1 + \frac{8}{n(n-1)} \sum_{i \ne j} I_{ij} - \frac{9}{n(n-1)(n-2)} \sum_{i \ne j \ne k} I_{ij} I_{kj},$$

where $I_{ij} := 1(X_{i,1} \le X_{j,1}, X_{i,2} \le X_{j,2})$, for all
$i,j \in \{1, \ldots, n\}$. Under $\mathcal{H}_0$, the latter test statistic is
asymptotically normal. Its asymptotic variance has been evaluated in [7]. [78]
has provided extensions of this idea towards higher order moments of $W$.
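
The moment relation $-1 + 8E[W] - 9E[W^2] = 0$ can be checked empirically with U-statistics built on the indicators $I_{ij}$: the average of $I_{ij}$ over pairs $i \ne j$ estimates $E[W]$, and the average of $I_{ij} I_{kj}$ over distinct triplets estimates $E[W^2]$. The Python sketch below assumes this U-statistic form with the normalizations $n(n-1)$ and $n(n-1)(n-2)$; the function name is ours, and no variance estimate (needed for an actual test) is provided.

```python
def ev_moment_statistic(sample):
    """-1 + 8 * (U-statistic for E[W]) - 9 * (U-statistic for E[W^2]), bivariate data."""
    n = len(sample)
    if n < 3:
        raise ValueError("at least three observations are required")
    # a[j] = number of indices i != j with X_i <= X_j componentwise (sum over i of I_ij)
    a = [
        sum(
            1
            for i in range(n)
            if i != j and sample[i][0] <= sample[j][0] and sample[i][1] <= sample[j][1]
        )
        for j in range(n)
    ]
    pairs = sum(a)                                # sum over i != j of I_ij
    triples = sum(a_j * (a_j - 1) for a_j in a)   # sum over distinct (i, j, k) of I_ij * I_kj
    return -1.0 + 8.0 * pairs / (n * (n - 1)) - 9.0 * triples / (n * (n - 1) * (n - 2))
```

On a perfectly comonotone sample the statistic is exactly zero, in line with the fact that the comonotone copula is an extreme-value copula; on a countermonotone sample it equals $-1$.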

These approaches rely on the so-called "reduction of dimension" techniques (see
Section...). To improve the power of GOF tests, it would be necessary to work
in functional spaces, i.e. concentrate on empirical counterparts of
extreme-value copulas, or, equivalently, of the functions $A$ themselves. For
instance, [78] proposed a Cramer-von-Mises GOF test, based on the Kendall's
function $K$ above. More generally, several estimates of the Pickands
dependence function are available, but most of them rely on the estimation of
marginal distributions: see section 9.3 in [6] or 12. Nonetheless, [42] have
built "pure" copula GOF test statistics, i.e. independent from margins, by
invoking empirical counterparts of the Pickands function introduced in [47]:
given our previous notations,

1. define the pseudo-observations

$$\hat U_i := n F_{n,1}(X_{i,1})/(n+1), \quad \hat V_i := n F_{n,2}(X_{i,2})/(n+1);$$

2. define the r.v. $\hat S_i := -\ln \hat U_i$ and $\hat T_i := -\ln \hat V_i$;

3. for every $i = 1, \ldots, n$, set $\hat\xi_i(0) := \hat S_i$, and
$\hat\xi_i(1) := \hat T_i$. Moreover, for every $t \in (0,1)$, set

$$\hat\xi_i(t) := \min\left( \frac{\hat S_i}{1-t}, \frac{\hat T_i}{t} \right);$$

4. Two estimates of $A$ are given by

$$A_n^P(t) := \left[ \frac{1}{n} \sum_{i=1}^n \hat\xi_i(t) \right]^{-1} \quad \text{and} \quad A_n^{CFG}(t) := \exp\left( -\gamma - \frac{1}{n} \sum_{i=1}^n \ln \hat\xi_i(t) \right),$$

where $\gamma$ denotes the Euler constant.

The two latter estimates are the "rank-based" versions of those proposed in
[75] and [16] respectively.

There is an explicit one-to-one mapping between $A_n^P$ (resp. $A_n^{CFG}$) and
the empirical copula $C_n$. Therefore, after endpoint corrections, [47] have
exhibited the weak limit of the corresponding processes
$\mathbb{A}_n^P := \sqrt{n}(A_n^P - A)$ and
$\mathbb{A}_n^{CFG} := \sqrt{n}(A_n^{CFG} - A)$. Working with the two latter
processes instead of $\mathbb{C}_n$, a lot of GOF tests can be built. For
instance, [42] have detailed an Anderson-Darling type test based on the $L^2$
norm of $\mathbb{A}_n^P$ and $\mathbb{A}_n^{CFG}$, even under composite null
assumptions.

In the same vein, another strategy has been proposed in [61]: there is an
equivalence between extreme-value copulas $C$ and max-stable copulas, i.e.
copulas for which $C(u)^r = C(u^r)$, for every $u \in [0,1]^d$ and
$r \in \mathbb{R}_+$. By setting
$\mathbb{D}_{n,r}(u) := \sqrt{n}(\{C_n(u^{1/r})\}^r - C_n(u))$, for all
$u \in [0,1]^d$ and every $r > 0$, [61] have built some tests based on the
limiting law of the joint process
$(\mathbb{D}_{n,r_1}, \ldots, \mathbb{D}_{n,r_p})$ for an arbitrary integer $p$.
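
The rank-based estimation of the Pickands dependence function described in this subsection can be sketched in a few lines of Python. The code assumes the usual forms of the rank-based Pickands and CFG estimators, namely $A_n^P(t) = [n^{-1} \sum_i \hat\xi_i(t)]^{-1}$ and $A_n^{CFG}(t) = \exp(-\gamma - n^{-1} \sum_i \ln \hat\xi_i(t))$ with $\hat\xi_i(t) = \min(\hat S_i/(1-t), \hat T_i/t)$. No endpoint correction is applied, ties are ignored, and all function names are ours.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant


def scaled_ranks(values):
    """Pseudo-observations rank / (n + 1) for one margin (ties are not handled)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    out = [0.0] * n
    for rank, i in enumerate(order, start=1):
        out[i] = rank / (n + 1)
    return out


def xi_values(x1, x2, t):
    """xi_i(t) = min(S_i / (1 - t), T_i / t), with S_i = -ln U_i and T_i = -ln V_i."""
    out = []
    for u_i, v_i in zip(scaled_ranks(x1), scaled_ranks(x2)):
        s_i, t_i = -math.log(u_i), -math.log(v_i)
        if t == 0:
            out.append(s_i)
        elif t == 1:
            out.append(t_i)
        else:
            out.append(min(s_i / (1 - t), t_i / t))
    return out


def pickands_estimator(x1, x2, t):
    """Rank-based Pickands estimator: inverse of the mean of the xi_i(t)."""
    xi = xi_values(x1, x2, t)
    return len(xi) / sum(xi)


def cfg_estimator(x1, x2, t):
    """Rank-based CFG estimator: exp(-gamma - mean of ln xi_i(t))."""
    xi = xi_values(x1, x2, t)
    return math.exp(-EULER_GAMMA - sum(math.log(z) for z in xi) / len(xi))
```

Without endpoint corrections, these raw estimates do not satisfy $A(0) = A(1) = 1$ in general. For comonotone data, both satisfy $A_n(t)/A_n(0) = \max(t, 1-t)$, which mirrors the Pickands function of the comonotone copula.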



1.4.3 Pair-copula constructions

In recent years, a lot of effort has been devoted to the construction of
$d$-dimensional copulas, $d > 2$, as combinations of several 2-dimensional
copulas. Some authors have enriched the Archimedean copula class: hierarchical,
nested or multiplicative Archimedean copulas. Among others, see [57], [94],
[67], [82], [69]. Other authors have studied the large class of vines: D-vines,
C-vines, and regular vines more generally (see [1], [20], e.g.). Inference,
simulation and specification techniques have made significant progress to deal
with these families of models $\mathcal{F}$. These advances provide large
classes of very flexible copulas.

We will not discuss in depth the way of choosing the best hierarchical
Archimedean copula or the best D-vine, for given data. Apparently, every
proposal in this stream of the literature follows the same steps:

(i) Assume an underlying class of models $\mathcal{F}$ (D-vine, for instance);

(ii) Choose the potential bivariate families of copulas that may appear in the con- 
struction; 

(iii) Evaluate the best structure (a network, or a tree), and estimate the associated 
bivariate copulas (simultaneously, in general). 

Mathematically, we can nest this methodology inside the general GOF copula
framework detailed above. Indeed, the copula candidates belong to a
finite-dimensional parametric family, even if the dimension of the unknown
parameter $\theta$ can be very large. Obviously, authors have developed ad-hoc
procedures to avoid such a brute-force approach to GOF testing: see 12TI or
[29] for vine selection, for instance.

In contrast, there is no test of the slightly different and more difficult GOF
problem

$\mathcal{H}_0$ : $C$ belongs to a given class $\mathcal{F}$.

For instance, a natural question would be to test whether an underlying copula
belongs to the large (and infinite-dimensional!) class of hierarchical
Archimedean copulas. To the best of our knowledge, this way of testing is still
a fully open problem.



1.5 GOF copula tests for multivariate time series

One limiting feature of copulas is the difficulty of using them in the presence
of multivariate dependent vectors $(X_n)_{n \in \mathbb{Z}}$, with
$X_n \in \mathbb{R}^d$. In general, the "modeler problem" is to specify the
full law of this process, i.e., the joint laws of $(X_{n_1}, \ldots, X_{n_p})$
for every $p$ and every indices $n_1, \ldots, n_p$, in a consistent way.
Applying the copula ideas to such a problem seems to be rather natural (see
[74] for a survey). Nonetheless, even if we restrict ourselves to stationary
processes, the latter task is far from easy.

A first idea is to describe the law of the vectors
$(X_m, X_{m+1}, \ldots, X_n)$ with copulas directly, for every couple $(m,n)$,
$m < n$. This can be done by modeling separately (but consistently)
$d(n-m+1)$ unconditional margins plus a $d(n-m+1)$-dimensional copula. This
approach seems particularly useful when the underlying process is stationary
and Markov (see ifTTl for the general procedure). But the conditions of Markov
coherence are complex (see [55]), and there is no general GOF strategy in this
framework, to the best of our knowledge.

A more usual procedure in econometrics is to specify a multivariate time-series
model, typically a linear regression, and to estimate residuals, assumed
serially independent: see 0~8), which deals with a GARCH-like model with
diagonal innovation matrix. They showed that estimating the copula parameters
using rank-based pseudo-likelihood methods with the ranks of the residuals
instead of the (non-observable) ranks of innovations leads to the same
asymptotic distribution. In particular, the limiting law of the estimated
copula parameters does not depend on the unknown parameters used to estimate
the conditional means and the conditional variances. This is very useful to
develop goodness-of-fit tests for the copula family of the innovations. [79]
extended these results: under similar technical assumptions, the empirical
copula process has the same limiting distribution as if one had started with
the innovations instead of the residuals. As a consequence, a lot of tools
developed for the serially independent case remain valid for the residuals.
However, that is not true if the stochastic volatility is genuinely
non-diagonal.

A third approach would be to use information on the marginal processes
themselves. This requires specifying conditional marginal distributions,
instead of unconditional margins as in the first idea above. This would induce
a richer application of the two-step basic copula idea, i.e., use "standard"
univariate processes as inputs of more complicated multivariate models:

1. for every $j = 1, \ldots, d$, specify the law of $X_{n,j}$ knowing the past
values $X_{n-1,j}, X_{n-2,j}, \ldots$;

2. specify (and/or estimate) relevant dependence structures, "knowing" these
univariate underlying processes, to recover the entire process
$(X_n)_{n \in \mathbb{Z}}$.

Using similar motivations, Patton ([72], [73]) introduced so-called conditional
copulas, which are associated with conditional laws in a particular way.
Specifically, let $X = (X_1, \ldots, X_d)$ be a random vector from
$(\Omega, \mathcal{A}_0, \mathbb{P})$ to $\mathbb{R}^d$. Consider some
arbitrary sub-$\sigma$-algebra $\mathcal{A} \subset \mathcal{A}_0$. A
conditional copula associated to $(X, \mathcal{A})$ is a
$\mathcal{B}([0,1]^d) \otimes \mathcal{A}$ measurable function $C$ such that,
for any $x_1, \ldots, x_d \in \mathbb{R}$,



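$$\mathbb{P}(X \le x \mid \mathcal{A}) = C\{\mathbb{P}(X_1 \le x_1 \mid \mathcal{A}), \ldots, \mathbb{P}(X_d \le x_d \mid \mathcal{A}) \mid \mathcal{A}\}.$$

The random function $C(\cdot \mid \mathcal{A})$ is uniquely defined on the
product of the values taken by
$x_j \mapsto \mathbb{P}(X_j \le x_j \mid \mathcal{A})(\omega)$,
$j = 1, \ldots, d$, for every realization $\omega \in \mathcal{A}$. As in the
proof of Sklar's theorem, $C(\cdot \mid A)$ can be extended on $[0,1]^d$ as a
copula, for every conditioning subset of events $A \subset \mathcal{A}$.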

In Patton's approach, it is necessary to know/model each margin, knowing all
the past information, and not only the past observations of each particular
margin. Nonetheless, practitioners often have good estimates of the conditional
distribution of each margin, conditionally given its own past, i.e.,
$\mathbb{P}(X_{n,j} \le x_j \mid \mathcal{A}_{n,j})$, $j = 1, \ldots, d$, by
setting $\mathcal{A}_{n,j} = \sigma(X_{n-1,j}, X_{n-2,j}, \ldots)$. To link
these quantities with the (joint) law of $X_n$ knowing its own past, it is
tempting to write

$$\mathbb{P}(X_n \le x \mid \mathcal{A}_n) = C^*\{\mathbb{P}(X_{n,1} \le x_1 \mid \mathcal{A}_{n,1}), \ldots, \mathbb{P}(X_{n,d} \le x_d \mid \mathcal{A}_{n,d})\},$$

for some random function $C^* : [0,1]^d \to [0,1]$ whose measurability would
depend on $\mathcal{A}_n$ and on the $\mathcal{A}_{n,j}$, $j = 1, \ldots, d$.
Actually, the latter function is a copula only if the process
$(X_{n,k}, k \ne j)_{n \in \mathbb{Z}}$ does not "Granger-cause" the process
$(X_{n,j})_{n \in \mathbb{Z}}$, for every $j = 1, \ldots, d$. This assumption
that each variable depends on its own lags, but not on the lags of any other
variable, is clearly strong, even though it can be accepted empirically; see
the discussion in [74], pp. 772-773. Thus, [37] has extended Patton's
conditional copula concept, by defining so-called pseudo-copulas, which are
simply cdfs on $[0,1]^d$ with arbitrary margins. They prove:

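Theorem 1.5.1. For any sub-algebras
$\mathcal{B}, \mathcal{A}_1, \ldots, \mathcal{A}_d$ such that
$\mathcal{A}_j \subset \mathcal{B}$, $j = 1, \ldots, d$, there exists a random
function $C : [0,1]^d \times \Omega \to [0,1]$ such that

$$\begin{aligned}
\mathbb{P}(X \le x \mid \mathcal{B})(\omega) &= C\{\mathbb{P}(X_1 \le x_1 \mid \mathcal{A}_1)(\omega), \ldots, \mathbb{P}(X_d \le x_d \mid \mathcal{A}_d)(\omega), \omega\} \\
&= C\{\mathbb{P}(X_1 \le x_1 \mid \mathcal{A}_1), \ldots, \mathbb{P}(X_d \le x_d \mid \mathcal{A}_d)\}(\omega),
\end{aligned}$$

for every $x = (x_1, \ldots, x_d) \in \mathbb{R}^d$ and almost every
$\omega \in \Omega$. This function $C$ is
$\mathcal{B}([0,1]^d) \otimes \mathcal{B}$ measurable. For almost every
$\omega \in \Omega$, $C(\cdot, \omega)$ is a pseudo-copula and is uniquely
defined on the product of the values taken by
$x_j \mapsto \mathbb{P}(X_j \le x_j \mid \mathcal{A}_j)(\omega)$,
$j = 1, \ldots, d$.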

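If $C$ is unique, it is called the conditional
$(\mathcal{A}, \mathcal{B})$-pseudo-copula associated with $X$ and denoted by
$C(\cdot \mid \mathcal{A}, \mathcal{B})$. Actually,
$C(\cdot \mid \mathcal{A}, \mathcal{B})$ is a copula iff

$$\mathbb{P}(X_j \le x_j \mid \mathcal{B}) = \mathbb{P}(X_j \le x_j \mid \mathcal{A}_j) \quad \text{a.e.} \qquad (1.3)$$

for all $j = 1, \ldots, d$ and $x \in \mathbb{R}^d$. This means that
$\mathcal{B}$ cannot provide more information about $X_j$ than $\mathcal{A}_j$,
for every $j$. Patton's conditional copula corresponds to the particular case
$\mathcal{B} = \mathcal{A}_1 = \cdots = \mathcal{A}_d$, for which (1.3) is
clearly satisfied.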

One key issue is to state whether pseudo-copulas really depend on the past
values of the underlying process, i.e., to test their constancy, an assumption
often made in practice. In [37], they estimate nonparametrically conditional
pseudo-copulas, including Patton's conditional copulas as a special case, and
test their constancy with respect to their conditioning subsets. Here, we
specify their technique.

For a stationary and strongly mixing process $(X_n)_{n \in \mathbb{Z}}$, we
restrict ourselves to conditional sub-algebras $\mathcal{A}_n$ and
$\mathcal{B}_n$ that are defined by a finite number of past values of the
process, typically $(X_{n-1}, X_{n-2}, \ldots, X_{n-p})$ for some $p \ge 1$.
The dependence of $\mathcal{A}$ and $\mathcal{B}$ with respect to past values
$y$ will be implicit hereafter. Formally, [37] consider the test of several
null hypotheses:

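(a) $\mathcal{H}_0^{(1)}$ : For every $y$, $C(\cdot \mid \mathcal{A}, \mathcal{B}) = C_0(\cdot)$,

against

$\mathcal{H}_a$ : For some $y$, $C(\cdot \mid \mathcal{A}, \mathcal{B}) \ne C_0(\cdot)$,

where $C_0$ denotes a fixed pseudo-copula function. In this case,
$\mathcal{H}_0^{(1)}$ means that the underlying conditional
$(\mathcal{A}, \mathcal{B})$-pseudo-copula is in fact a true copula,
independent of the past values of the process.

(b) $\mathcal{H}_0^{(2)}$ : There exists a parameter $\theta_0$ such that
$C(\cdot \mid \mathcal{A}, \mathcal{B}) = C_{\theta_0} \in \mathcal{C}$ for
every $y$, where $\mathcal{C} = \{C_\theta, \theta \in \Theta\}$ denotes some
parametric family of pseudo-copulas.

(c) $\mathcal{H}_0^{(3)}$ : For some function
$\theta(y) = \theta(\mathcal{A}, \mathcal{B})$, we have
$C(\cdot \mid \mathcal{A}, \mathcal{B}) = C_{\theta(y)} \in \mathcal{C}$, for
every $y$.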

The latter assumption says that the conditional pseudo-copulas stay inside the
same pre-specified parametric family of pseudo-copulas (possibly copulas), for
different observed values in the past. [37] proposed a fully nonparametric
estimator of the conditional pseudo-copulas, and derived its limiting
distribution. This provides a framework for "brute-force" GOF tests of
multivariate dynamic dependence structures (conditional copulas, or even
pseudo-copulas), similarly to what has been done in section 1.2.

[37] stated the equivalent of the empirical processes $\mathbb{C}_n$ or
$\hat{\mathbb{C}}_n$. Use the short-hand notation $X_m^n$ for the vector
$(X_m, X_{m+1}, \ldots, X_n)$. Similarly, write
$X_{m,j}^n = (X_{m,j}, \ldots, X_{n,j})$.

Assume that every conditioning set $\mathcal{A}_{n,j}$ (resp. $\mathcal{B}_n$)
is related to the vector $X_{n-p,j}^{n-1}$ (resp. $X_{n-p}^{n-1}$).
Specifically, consider the events $(X_{n-p}^{n-1} = y^*) \in \mathcal{B}_n$,
with $y^* = (y_1, \ldots, y_p)$, and
$(X_{n-p,j}^{n-1} = y_j^*) \in \mathcal{A}_{n,j}$, with
$y_j^* = (y_{1j}, \ldots, y_{pj})$. Their nonparametric estimator of the
pseudo-copula is based on a standard plug-in technique that requires estimates
of the joint conditional distribution

$$m(x \mid y^*) = \mathbb{P}(X_p \le x \mid X_0^{p-1} = y^*),$$

and of conditional marginal cdf's

$$m_j(x_j \mid y_j^*) = \mathbb{P}(X_{p,j} \le x_j \mid X_{0,j}^{p-1} = y_j^*), \quad j = 1, \ldots, d.$$

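Let $F_{n,j}$ be the (marginal) empirical distribution function of $X_j$, based
on $(X_{1,j}, \ldots, X_{n,j})$. For convenient kernels $K$ and $\bar K$, set

$$K_h(x) = h^{-pd} K\left( \frac{x_1}{h}, \ldots, \frac{x_{pd}}{h} \right), \quad \text{and} \quad \bar K_{\bar h}(x) = \bar h^{-p} \bar K\left( \frac{x_1}{\bar h}, \ldots, \frac{x_p}{\bar h} \right).$$

For every $x \in \mathbb{R}^d$ and $y^* \in \mathbb{R}^{pd}$, estimate the
conditional distribution
$m(x \mid y^*) = \mathbb{P}(X_p \le x \mid X_0^{p-1} = y^*)$ by

$$m_n(x \mid y^*) = \frac{1}{n-p} \sum_{\ell=1}^{n-p} K_n(X_\ell^{\ell+p-1})\, 1(X_{\ell+p} \le x),$$

where

$$K_n(X_\ell^{\ell+p-1}) = K_h\{F_{n,1}(X_{\ell,1}) - F_{n,1}(y_{11}), \ldots, F_{n,d}(X_{\ell,d}) - F_{n,d}(y_{1d}), \ldots, F_{n,1}(X_{\ell+p-1,1}) - F_{n,1}(y_{p1}), \ldots, F_{n,d}(X_{\ell+p-1,d}) - F_{n,d}(y_{pd})\}.$$

Similarly, for all $x_j \in \mathbb{R}$ and $y_j^* \in \mathbb{R}^p$, the
conditional marginal cdf $m_j(x_j \mid y_j^*)$ is estimated in a nonparametric
way by

$$m_{n,j}(x_j \mid y_j^*) = \frac{1}{n-p} \sum_{\ell=1}^{n-p} \bar K_{\bar h}\{F_{n,j}(X_{\ell,j}) - F_{n,j}(y_{1j}), \ldots, F_{n,j}(X_{\ell+p-1,j}) - F_{n,j}(y_{pj})\}\, 1(X_{\ell+p,j} \le x_j),$$

for every $j = 1, \ldots, d$. [37] proposed to estimate the underlying
conditional pseudo-copula by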

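$$\hat C(u \mid X_{n-p}^{n-1} = y^*) = m_n\{m_{n,1}^{(-1)}(u_1 \mid y_1^*), \ldots, m_{n,d}^{(-1)}(u_d \mid y_d^*) \mid y^*\},$$

with the use of pseudo-inverse functions. Then, under $\mathcal{H}_0^{(1)}$,
for all $u \in [0,1]^d$ and $y^*$,

$$\sqrt{n h^{pd}}\, \{\hat C(u \mid X_{n-p}^{n-1} = y^*) - C_0(u)\} \xrightarrow{d} \mathcal{N}[0, \sigma(u)]$$

as $n \to \infty$, where
$\sigma(u) = C_0(u)\{1 - C_0(u)\} \int K^2(v)\,dv$. This result can be extended
to deal with different vectors $y^*$ simultaneously, and with the null
hypotheses $\mathcal{H}_0^{(2)}$ and $\mathcal{H}_0^{(3)}$: for all
$u \in \mathbb{R}^d$,

$$\sqrt{n h^{pd}}\, \{\hat C(u \mid y_1^*) - C_{\hat\theta_1}(u), \ldots, \hat C(u \mid y_q^*) - C_{\hat\theta_q}(u)\} \xrightarrow{d} \mathcal{N}[0, \Sigma(u, y_1^*, \ldots, y_q^*)],$$

as $n \to \infty$, where

$$\Sigma(u, y_1^*, \ldots, y_q^*) = \mathrm{diag}\left( C_{\theta(y_k^*)}(u)\{1 - C_{\theta(y_k^*)}(u)\} \int K^2(v)\,dv, \ 1 \le k \le q \right),$$

for some consistent estimators $\hat\theta_k$ such that
$\hat\theta_k = \theta(y_k^*) + O_P(n^{-1/2})$, $k = 1, \ldots, q$. Each $k$th
term on the diagonal of $\Sigma$ can be consistently estimated by

$$\hat\sigma_k^2(u) = C_{\hat\theta_k}(u)\{1 - C_{\hat\theta_k}(u)\} \int K^2(v)\,dv.$$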
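
To fix ideas about the kernel step, here is a minimal Python sketch of a conditional marginal cdf estimator of the type $m_{n,j}$ for a single component of the process. It follows the form $(n-p)^{-1} \sum_\ell \bar K_{\bar h}\{\cdot\}\, 1(X_{\ell+p,j} \le x_j)$ literally, with a product Epanechnikov kernel and a single bandwidth `h`; these choices, like the function names, are our own assumptions for illustration.

```python
import bisect


def epanechnikov(v):
    """Univariate Epanechnikov kernel on [-1, 1]."""
    return 0.75 * (1.0 - v * v) if abs(v) <= 1.0 else 0.0


def empirical_cdf(series):
    """Return z -> F_n(z), the empirical cdf of the series."""
    data = sorted(series)
    n = len(data)
    return lambda z: bisect.bisect_right(data, z) / n


def conditional_marginal_cdf(series, y_star, x, h):
    """Kernel estimate of P(X_t <= x | (X_{t-p}, ..., X_{t-1}) = y_star), p = len(y_star)."""
    n, p = len(series), len(y_star)
    F_n = empirical_cdf(series)
    total = 0.0
    for start in range(n - p):  # window X_start, ..., X_{start+p-1}, followed by X_{start+p}
        weight = 1.0
        for lag in range(p):
            diff = F_n(series[start + lag]) - F_n(y_star[lag])
            weight *= epanechnikov(diff / h) / h  # product kernel, rescaled by h^{-p}
        if series[start + p] <= x:
            total += weight
    return total / (n - p)
```

Because the kernel weights are not renormalized by their sum, this estimate is not constrained to lie in $[0,1]$ in small samples; a Nadaraya-Watson type normalization would be a natural variant.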

Note that, in the corollary above, the limiting correlation matrix is diagonal
because we are considering different conditioning values
$y_1^*, \ldots, y_q^*$ but the same argument $u$. In contrast, an identical
conditioning event but different arguments $u_1, u_2, \ldots$ would lead to a
complex (non-diagonal) correlation matrix, as explained in [33]. The latter
weak convergence result of random vectors allows the building of GOF tests as
in section 1.2. For instance, as in [33], a simple test procedure may be

$$\mathcal{T}(u, y_1^*, \ldots, y_q^*) = n h^{pd} \sum_{k=1}^q \frac{\{\hat C(u \mid X_{n-p}^{n-1} = y_k^*) - C_{\hat\theta_k}(u)\}^2}{\hat\sigma_k^2(u)},$$

for different choices of $u$ and conditioning values $y_k^*$. Under the null
hypothesis, the term on the right-hand side tends to a $\chi^2(q)$
distribution. Note that this test is "local" since it depends strongly on the
choice of a single $u$. An interesting extension would be to build a "global"
test, based on the behavior of the full process

$$\sqrt{n h^{pd}}\, \{\hat C(\cdot \mid X_{n-p}^{n-1} = y_k^*) - C_{\hat\theta_k}(\cdot)\}.$$

But the task of getting pivotal limiting laws is far from easy, as illustrated
in [33].
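
Once the conditional pseudo-copula estimates and the fitted model values at a given $u$ are available, a local chi-square statistic of this kind is a one-line computation. The Python sketch below assumes the standardized form $n h^{pd} \sum_k (\hat C_k - C_k)^2 / \hat\sigma_k^2$ with $\hat\sigma_k^2 = C_k (1 - C_k) \int K^2$, and provides a closed-form $\chi^2(q)$ survival function for even $q$ only. All names and argument conventions are ours.

```python
import math


def local_chi2_statistic(c_hat, c_model, n, h, p, d, kernel_l2_norm):
    """n * h^(p*d) * sum_k (c_hat_k - c_model_k)^2 / sigma_k^2.

    c_hat[k]   : nonparametric estimate at u for the k-th conditioning value
    c_model[k] : fitted parametric (pseudo-)copula value at u, strictly inside (0, 1)
    kernel_l2_norm : integral of the squared kernel K
    """
    scale = n * h ** (p * d)
    total = 0.0
    for est, mod in zip(c_hat, c_model):
        sigma2 = mod * (1.0 - mod) * kernel_l2_norm
        total += (est - mod) ** 2 / sigma2
    return scale * total


def chi2_survival_even(x, q):
    """P(chi2(q) > x) for an even number q of degrees of freedom (closed form)."""
    if q <= 0 or q % 2:
        raise ValueError("this closed form requires an even, positive q")
    half = x / 2.0
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(q // 2))
```

For an odd number of conditioning values, a general chi-square distribution function from a statistical library would be needed instead of the closed form.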

In practice, authors often restrict themselves to the case of time-dependent
copula parameters instead of managing time-dependent multivariate cdfs
nonparametrically. For instance, every conditional copula or pseudo-copula is
assumed to belong to the Clayton family, and their random parameters $\theta$
depend on the past observations. (0 has proposed a non-parametric estimate
$\hat\theta(\cdot)$ of the function $\theta$, in the case of a univariate
conditioning variable. It seems possible to build some GOF tests based on this
estimate and its limiting behavior, at least for simple null hypotheses, but
the theory requires more developments.



1.6 Practical performances of GOF copula tests 

Once a paper introduces one or several new copula GOF tests, it is rather usual
to include an illustrative section. Typically, two characteristics are of
interest for some tests in competition: their ability to maintain the
theoretical levels, and their power performances under several alternatives.
Nonetheless, these empirical elements, although useful, are often partial and
insufficient to form a clear judgement. Actually, only a few papers have
studied and compared the performances of the main previous tests in depth.
Indeed, the computing power required for such a large analysis is significant.
That is why a lot of simulation studies restrict themselves to bivariate
copulas and small or moderate sample sizes (from $n = 50$ to $n = 500$,
typically). The most extensive studies of finite sample performances are
probably those of HI and B31I. In both papers, the set of tests under scrutiny
contains the three main approaches:

1. "brute-force" proposals like $T_n^{KS}$ and/or $T_n^{CvM}$, as in section 1.2;

2. Kendall's process based tests;

3. test statistics invoking the PIT (see section 1.3).

These works found that a lot of tests perform rather well, even for small
samples (from $n = 50$, e.g.). Moreover, it is difficult to exhibit a clear
hierarchy among all of these tests in terms of power performances. As pointed
out by [45],

No single test is preferable to all others, irrespective of the circumstances. 

In their experiments, [45] restricted themselves to bivariate copulas and small
sample sizes $n \in \{50, 150\}$. The statistics based on Kendall's dependence
function are promoted, particularly when the underlying copula is assumed to be
Archimedean. It appeared that Cramer-von-Mises style test statistics are
preferable to Kolmogorov-Smirnov ones, all other things being equal, and
whatever the possible transformations of the data and/or the reductions of
information. Among the tests based on a Cramer-von-Mises statistic, it is
difficult to discriminate between the three main approaches.

The latter fact is confirmed in [8], which conducted simulation experiments
with higher dimensions $d \in \{2,4,8\}$ and larger sample sizes
$n \in \{100, 500\}$. [8] observed the particularly good performances of a new
test statistic, calculated as the average of the three approaches. Moreover, he
studied the impact of the variables ordering in the PIT. Even if estimated
p-values may be different, depending on which permutation order is chosen, this
does not seem to create worrying discrepancies.

Notably ifTTI conducted an extensive simulation experiment of the same type,
but their main focus was on detecting small departures from the null
hypothesis. Thus, they studied the asymptotic behavior of some GOF test
statistics under sequences of alternatives of the type

$$\mathcal{H}_{a,n} : C = (1 - \delta_n) C_0 + \delta_n D,$$

where $\delta_n = n^{-1/2} \delta$, $\delta > 0$, and $D$ is another copula.
They computed local power curves and compared them for different test
statistics. They showed that the estimation strategy can have a significant
impact on the power of Cramer-von-Mises statistics and that some "moment-based"
statistics provide very powerful tests under many distributional scenarios.
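
Samples under such local alternatives are simple to simulate, since a mixture $(1 - \delta_n) C_0 + \delta_n D$ can be drawn by picking each observation from $D$ with probability $\delta_n$ and from $C_0$ otherwise. The Python sketch below assumes user-supplied samplers for the two copulas; the independence and comonotone samplers are only examples, and all names are ours.

```python
import math
import random


def sample_local_alternative(n, delta, sample_c0, sample_d, rng):
    """Draw n points from (1 - delta_n) * C0 + delta_n * D, with delta_n = delta / sqrt(n)."""
    delta_n = delta / math.sqrt(n)
    if not 0.0 <= delta_n <= 1.0:
        raise ValueError("delta / sqrt(n) must lie in [0, 1]")
    return [
        sample_d(rng) if rng.random() < delta_n else sample_c0(rng)
        for _ in range(n)
    ]


def sample_independence(rng):
    """One draw from the bivariate independence copula."""
    return (rng.random(), rng.random())


def sample_comonotone(rng):
    """One draw from the bivariate comonotone copula."""
    u = rng.random()
    return (u, u)
```

A local power curve can then be approximated by repeating the test on many such samples for a grid of values of `delta`.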

Despite the number of available tests in the literature, the usefulness of all
these procedures in practice has to be proved more convincingly. Apparently,
some authors have raised doubts about the latter point. For instance, [91] has
evaluated the performances of Value-at-Risk or VaR (quantiles of loss) and
Expected Shortfall or ES (average losses above a VaR level) forecasts, for a
large set of portfolios of two financial assets and different copula models.
They estimate static copula models on couples of asset return residuals, once
GARCH(1,1) dynamics have been fitted for every asset independently. They
applied three families of GOF tests (empirical copula process, PIT, Kendall's
function) and five copula models. They found that,

Although copula models with GARCH margins yield considerably better estimates than 
correlation-based models, the identification of the optimal parametric copula form is a seri- 
ous unsolved problem. 

Indeed, none of the GOF tests is able to select the copula family that yields
the best VaR- or ES-forecasts. This points out the difficulty of finding
relevant and stable multivariate dynamics models, especially related to joint
extreme moves. But such results highlight the fact that there remains a
significant gap between good performances with simulated experiments and
trustworthy multivariate models, even validated formally by statistical tests.

Indeed, contrary to studies based on simulated samples drawn from an assumed
copula family (the standard case, as in B31 or [8]), real data can suffer from
outliers or measurement errors. This is magnified by the fact that most
realistic copulas are actually time-dependent ([91]) and/or are mixtures of
copulas ([63]). Therefore, [92] showed that even minor contamination of a
dataset can lead to significant power decreases of copula GOF tests. He applied
several outlier detection methods from the theory of robust statistics, as in
[66], before conducting the formal GOF test of any parametric copula family.
[92] concluded that the exclusion of outliers can have a beneficial effect on
the power of copula GOF tests.